This is an RMarkdown document that we will go through together during the 20230409 morning sessions. Some of the objectives are:
cellranger
We assume that you have installed the latest R on your laptop (currently R 4.2.3) and have also updated to the latest RStudio (in my case, 2023.03.0+386).
The following code ensures that the packages I install are placed in a defined directory:
.libPaths("~/R_xenopus")
.libPaths()
## [1] "/Users/chlee/R_xenopus"
## [2] "/Library/Frameworks/R.framework/Versions/4.2/Resources/library"
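One caveat with this approach: .libPaths() silently drops directories that do not exist, so the custom library folder has to be created before it is set. A minimal sketch, using a throwaway folder under tempdir() as a stand-in for ~/R_xenopus (the folder name is made up so the example does not touch your home directory):

```r
# Hypothetical library folder; in the workshop this would be "~/R_xenopus"
lib_dir <- file.path(tempdir(), "R_xenopus_demo")

# Create the folder if it is missing (.libPaths() ignores non-existent paths)
if (!dir.exists(lib_dir)) {
  dir.create(lib_dir, recursive = TRUE)
}

# Put the custom folder first so that newly installed packages land there
.libPaths(c(lib_dir, .libPaths()))

# The first entry should now be the custom folder
lib_is_first <- normalizePath(.libPaths()[1]) == normalizePath(lib_dir)
print(lib_is_first)
```

If the folder had not been created, the first entry of .libPaths() would still be the default library and packages would be installed there instead.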
The following code installs the Bioconductor package
manager. Setting eval=FALSE on the chunk ensures that the installation
is not run again while the RMarkdown document is being rendered.
if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(version = "3.16")
Now let us install Seurat. One of the strengths of R
comes from a set of packages developed by the RStudio group, the
tidyverse, so let’s install this as well (you may have it
installed already). I also want to add one more small package,
tictoc, which is handy for measuring how long a piece of code
takes to run.
During this installation run, which will take a few minutes, R asks
whether the igraph package should be compiled from source. At
least on macOS (as of 2023-04-07) this fails, so do not compile
igraph; just use the older pre-compiled version
instead.
BiocManager::install(c("tidyverse", "Seurat", "tictoc", "devtools") )
# See this: https://github.com/Toniiiio/imageclipr
# devtools::install_github('Timag/imageclipr')
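After the installation finishes, it can be reassuring to check which of the requested packages are actually available. The helper below is a small sketch in base R (missing_packages is a name I made up, not part of BiocManager); it looks for each package with system.file() without loading it:

```r
# Return the subset of `pkgs` that cannot be found in the current library paths
missing_packages <- function(pkgs) {
  found <- vapply(pkgs, function(p) nzchar(system.file(package = p)), logical(1))
  unname(pkgs[!found])
}

# The packages requested in the install command above
wanted <- c("tidyverse", "Seurat", "tictoc", "devtools")
print(missing_packages(wanted))  # character(0) means everything was found
```

An empty result means all four packages were found in one of the folders listed by .libPaths().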
The following code checks whether the installed packages are all up to date. Note
that the igraph package is out of date, but this is OK; leave
it as it is.
BiocManager::valid()
## Warning: 1 packages out-of-date; 0 packages too new
##
## * sessionInfo()
##
## R version 4.2.3 (2023-03-15)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur ... 10.16
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] digest_0.6.31 R6_2.5.1 jsonlite_1.8.4
## [4] evaluate_0.20 cachem_1.0.7 rlang_1.1.0
## [7] cli_3.6.1 rstudioapi_0.14 jquerylib_0.1.4
## [10] bslib_0.4.2 rmarkdown_2.21 tools_4.2.3
## [13] xfun_0.38 yaml_2.3.7 fastmap_1.1.1
## [16] compiler_4.2.3 BiocManager_1.30.20 htmltools_0.5.5
## [19] knitr_1.42 sass_0.4.5
##
## Bioconductor version '3.16'
##
## * 1 packages out-of-date
## * 0 packages too new
##
## create a valid installation with
##
## BiocManager::install("igraph", update = TRUE, ask = FALSE, force = TRUE)
##
## more details: BiocManager::valid()$too_new, BiocManager::valid()$out_of_date
The BiocManager::valid() run above already printed
sessionInfo(), but please include sessionInfo() in all of your R runs for
reproducibility purposes. It lists the R version, the platform, and every
attached or loaded package with its version (packages are found in the
folders given by .libPaths()), so you can track down any issues of reproducibility here.
sessionInfo()
## R version 4.2.3 (2023-03-15)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur ... 10.16
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] digest_0.6.31 R6_2.5.1 jsonlite_1.8.4
## [4] evaluate_0.20 cachem_1.0.7 rlang_1.1.0
## [7] cli_3.6.1 rstudioapi_0.14 jquerylib_0.1.4
## [10] bslib_0.4.2 rmarkdown_2.21 tools_4.2.3
## [13] xfun_0.38 yaml_2.3.7 fastmap_1.1.1
## [16] compiler_4.2.3 BiocManager_1.30.20 htmltools_0.5.5
## [19] knitr_1.42 sass_0.4.5
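If you want to keep this report next to your results, one option is to write it to a text file. This is only a sketch: the file name and the use of tempdir() are my own choices for illustration, and the prefix simply mirrors the date used in this session.

```r
# Hypothetical prefix and output location (tempdir() so nothing of yours is overwritten)
project_prefix <- "20230409_"
out_file <- file.path(tempdir(), paste0(project_prefix, "sessionInfo.txt"))

# capture.output() turns the printed report into a character vector
session_lines <- capture.output(sessionInfo())
writeLines(session_lines, out_file)

# Peek at the first line of the saved report
print(readLines(out_file, n = 1))
```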
Double check where you are. This helps when you want to use relative paths later:
getwd() # Usually the Document directory
## [1] "/Users/chlee/Dropbox (HMS)/tabinLab/presentation/20230408(XenopusBioinfo2023)/20230409"
here::here() # Usually the project directory
## [1] "/Users/chlee/Dropbox (HMS)/tabinLab/presentation/20230408(XenopusBioinfo2023)"
(It should be the project folder shown in the top right corner of RStudio.)
First, load the libraries. This lets you use their exported functions without prefixing the package names. It is also useful to set up a project prefix, so that all the intermediate files can be tracked more easily.
library(tidyverse) # we mostly use dplyr library
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.1 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(Seurat)
## Attaching SeuratObject
library(patchwork)
theme_set( theme_bw() )
root.dir <- here::here()
"%ni%" <- Negate("%in%")
project.prefix <- "20230409_"
sessionInfo()
## R version 4.2.3 (2023-03-15)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur ... 10.16
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] patchwork_1.1.2 SeuratObject_4.1.3 Seurat_4.3.0 lubridate_1.9.2
## [5] forcats_1.0.0 stringr_1.5.0 dplyr_1.1.1 purrr_1.0.1
## [9] readr_2.1.4 tidyr_1.3.0 tibble_3.2.1 ggplot2_3.4.2
## [13] tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] Rtsne_0.16 colorspace_2.1-0 deldir_1.0-6
## [4] ellipsis_0.3.2 ggridges_0.5.4 rprojroot_2.0.3
## [7] spatstat.data_3.0-1 rstudioapi_0.14 leiden_0.4.3
## [10] listenv_0.9.0 ggrepel_0.9.3 fansi_1.0.4
## [13] codetools_0.2-19 splines_4.2.3 cachem_1.0.7
## [16] knitr_1.42 polyclip_1.10-4 jsonlite_1.8.4
## [19] ica_1.0-3 cluster_2.1.4 png_0.1-8
## [22] uwot_0.1.14 spatstat.sparse_3.0-1 shiny_1.7.4
## [25] sctransform_0.3.5 BiocManager_1.30.20 compiler_4.2.3
## [28] httr_1.4.5 Matrix_1.5-4 fastmap_1.1.1
## [31] lazyeval_0.2.2 cli_3.6.1 later_1.3.0
## [34] htmltools_0.5.5 tools_4.2.3 igraph_1.4.1
## [37] gtable_0.3.3 glue_1.6.2 reshape2_1.4.4
## [40] RANN_2.6.1 Rcpp_1.0.10 scattermore_0.8
## [43] jquerylib_0.1.4 vctrs_0.6.1 nlme_3.1-162
## [46] spatstat.explore_3.1-0 progressr_0.13.0 lmtest_0.9-40
## [49] spatstat.random_3.1-4 xfun_0.38 globals_0.16.2
## [52] timechange_0.2.0 mime_0.12 miniUI_0.1.1.1
## [55] lifecycle_1.0.3 irlba_2.3.5.1 goftest_1.2-3
## [58] future_1.32.0 MASS_7.3-58.3 zoo_1.8-11
## [61] scales_1.2.1 spatstat.utils_3.0-2 hms_1.1.3
## [64] promises_1.2.0.1 parallel_4.2.3 RColorBrewer_1.1-3
## [67] yaml_2.3.7 gridExtra_2.3 reticulate_1.28
## [70] pbapply_1.7-0 sass_0.4.5 stringi_1.7.12
## [73] rlang_1.1.0 pkgconfig_2.0.3 matrixStats_0.63.0
## [76] evaluate_0.20 lattice_0.21-8 tensor_1.5
## [79] ROCR_1.0-11 htmlwidgets_1.6.2 cowplot_1.1.1
## [82] tidyselect_1.2.0 here_1.0.1 parallelly_1.35.0
## [85] RcppAnnoy_0.0.20 plyr_1.8.8 magrittr_2.0.3
## [88] R6_2.5.1 generics_0.1.3 DBI_1.1.3
## [91] pillar_1.9.0 withr_2.5.0 fitdistrplus_1.1-8
## [94] abind_1.4-5 survival_3.5-5 sp_1.6-0
## [97] future.apply_1.10.0 KernSmooth_2.23-20 utf8_1.2.3
## [100] spatstat.geom_3.1-0 plotly_4.10.1 tzdb_0.3.0
## [103] rmarkdown_2.21 grid_4.2.3 data.table_1.14.8
## [106] digest_0.6.31 xtable_1.8-4 httpuv_1.6.9
## [109] munsell_0.5.0 viridisLite_0.4.1 bslib_0.4.2
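Two of the helpers defined in the setup chunk, %ni% and project.prefix, are easier to appreciate with a tiny example. The gene names and the output file name below are toy values chosen for illustration:

```r
# Same definitions as in the setup chunk
"%ni%" <- Negate("%in%")
project.prefix <- "20230409_"

# Toy values: keep every gene that is NOT in the exclusion list
genes <- c("hes5.1.L", "hes5.1.S", "sox2.L")
exclude <- c("sox2.L")
kept <- genes[genes %ni% exclude]
print(kept)

# Use the prefix so that intermediate files sort together (hypothetical file name)
out_name <- paste0(project.prefix, "filtered_genes.csv")
print(out_name)
```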
If you want to use a specific function from a package you did NOT
load with the library command, you can always use
[package name]::[function name], which I am going to do in a
minute with the tictoc package:
tictoc::tic() # this is a function from the tictoc package
tictoc::toc() # this is a function from the tictoc package
## 0.001 sec elapsed
Why is this (sometimes) important? If you load many libraries, functions with identical names from different packages can mask one another depending on the loading order, which makes it ambiguous which function is being called. In that case it may be necessary to state explicitly which package the function comes from.
Konrad has very useful functions that you can also load. Notice the changes in the environment (typically the top right panel) after running this code chunk:
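Here is a small sketch of such a name clash that needs no extra packages. I define a toy filter() function (made up for this example) that hides stats::filter(), much as dplyr::filter() does in the tidyverse start-up message:

```r
# A user-defined function named filter() now hides stats::filter()
filter <- function(x, ...) {
  stop("this toy filter() is not stats::filter()")
}

# Calling filter() without a prefix reaches the toy function and fails
unprefixed <- tryCatch(filter(1:5, rep(1 / 3, 3)),
                       error = function(e) conditionMessage(e))
print(unprefixed)

# The package prefix removes the ambiguity: a 3-point moving average
smoothed <- stats::filter(1:5, rep(1 / 3, 3))
print(as.numeric(smoothed))

# Remove the toy function so that later code is unaffected
rm(filter)
```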
# Not necessary for the practice, but useful
source("https://raw.githubusercontent.com/xenbase-hub/workshop/main/toolbox.R")
Now let’s read a count matrix generated by 10X cellranger.
Seurat has a handy function called Read10X that
loads the data in a sparse matrix format (dgCMatrix). In
class we will check where these data come from; for now it
suffices to provide the directory where the necessary files are
present:
tictoc::tic()
xenopus.data <- Read10X(data.dir = "./scCapSt27_count/outs/filtered_gene_bc_matrices/XENLA_GCA001663975v1_XBv9p2/")
# xenopus.data <- Read10X(data.dir = "./scCapSt27_xen10_1_20230408/outs/filtered_feature_bc_matrix/")
tictoc::toc()
## 1.826 sec elapsed
class(xenopus.data)
## [1] "dgCMatrix"
## attr(,"package")
## [1] "Matrix"
head(xenopus.data[,1:30]) # only the first 30 cellular barcodes
## 6 x 30 sparse Matrix of class "dgCMatrix"
## [[ suppressing 30 column names 'AAACCTGAGCTATGCT-1', 'AAACCTGAGGGTTCCC-1', 'AAACCTGCAGATGGGT-1' ... ]]
##
## gene25011|Xelaev18004747m . . . . . . . . . . . . . . . . . . . . . . . . . .
## gene21250|Xetrov90028798m.L . . . . . . . . . . . . . . . . . . . . . . . . . .
## gene27977|Xelaev18004749m . . . . . . . . . . . . . . . . . . . . . . . . . .
## gene26149|Xelaev18004750m . . . . . . . . . . . . . . . . . . . . . . . . . .
## gene25611|Xelaev18004751m . . . . . . . . . . . . . . . . . . . . . . . . . .
## gene30800|Xelaev18004752m . . . . . . . . . . . . . . . . . . . . . . . . . .
##
## gene25011|Xelaev18004747m . . . .
## gene21250|Xetrov90028798m.L . . . .
## gene27977|Xelaev18004749m . . . .
## gene26149|Xelaev18004750m . . . .
## gene25611|Xelaev18004751m . . . .
## gene30800|Xelaev18004752m . . . .
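The dots in this printout are zeros that the sparse format does not store. To get a feel for the dgCMatrix class, here is a toy example built with the Matrix package (a recommended package that normally ships with R). All gene names, cell names and counts are made up:

```r
library(Matrix)  # recommended package, normally installed together with R

# Toy count matrix: 3 genes x 4 cells with only three non-zero entries
toy <- sparseMatrix(
  i = c(1, 2, 3),          # row (gene) indices of the non-zero entries
  j = c(1, 3, 4),          # column (cell) indices of the non-zero entries
  x = c(5, 1, 2),          # the counts themselves
  dims = c(3, 4),
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("cell1", "cell2", "cell3", "cell4"))
)

print(class(toy))   # the class name, as for xenopus.data
print(toy)          # zeros are shown as dots

# Only the non-zero values are stored
n_stored <- length(toy@x)
print(n_stored)
```

With most entries of a single-cell count matrix being zero, storing only the non-zero values saves a lot of memory.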
In this matrix, the rows represent genes (features) and the columns represent cellular barcodes. You can take a peek at how the cell names are represented:
head( colnames(xenopus.data) )
## [1] "AAACCTGAGCTATGCT-1" "AAACCTGAGGGTTCCC-1" "AAACCTGCAGATGGGT-1"
## [4] "AAACCTGCAGCCACCA-1" "AAACCTGGTCTGCCAG-1" "AAACCTGGTTCACGGC-1"
nchar("AAACCTGAGCTATGCT-1")
## [1] 18
This is typical cellranger output, where the 16 bp
barcode sequence is suffixed with -1.
Now let’s make a SeuratObject, the data structure used by the Seurat
package. With some parameters, you can already do some filtering
at this step.
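If you ever need the bare barcode sequence, the suffix can be split off with a regular expression. A short sketch using the first three barcodes printed above:

```r
# First three cell names from the output above
barcodes <- c("AAACCTGAGCTATGCT-1", "AAACCTGAGGGTTCCC-1", "AAACCTGCAGATGGGT-1")

# Drop the trailing "-<number>" to get the barcode sequence
bc_seq <- sub("-[0-9]+$", "", barcodes)

# Keep only what follows the last "-" to get the suffix
bc_suffix <- sub("^.*-", "", barcodes)

print(nchar(bc_seq))       # length of each barcode sequence
print(unique(bc_suffix))   # the suffixes that occur
```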
xenopus <- CreateSeuratObject(
counts = xenopus.data, # Here you put your count matrix
project = "XenopusBioInfo2023", # This is just a handy name attached to the object
min.cells = 3, # Keep only genes that are detected in at least 3 cells
min.features = 200 # Keep only cells that have at least 200 genes detected
)
## Warning: Feature names cannot have underscores ('_'), replacing with dashes
## ('-')
## Warning: Feature names cannot have pipe characters ('|'), replacing with dashes
## ('-')
It is often helpful to pay attention to the warnings that arise. Here you have two warning messages. Let’s check what they mean.
First, “Feature names cannot have underscores”. Are there genes that have underscores?
# grep is a common UNIX command that also exists as an R function
grep("_", rownames(xenopus.data), value = T)
## [1] "gene16511|car1_predicted.S" "gene134|hes5_X2.L"
## [3] "gene9235|hes5_X1.L" "gene13524|hes5_X2.S"
Yes, there are four gene names that contain an underscore.
hes5 sounds familiar, so let’s check whether there are any
issues with this gene:
grep("hes5", rownames(xenopus.data), value = T)
## [1] "gene19724|hes5.2.L" "gene18133|hes5.1.L" "gene134|hes5_X2.L"
## [4] "gene9235|hes5_X1.L" "gene34361|hes5.2.S" "gene37268|hes5.1.S"
## [7] "gene13524|hes5_X2.S"
As you can see, there are 7 different feature names associated with
hes5, more than the usual L and S forms. It might be
helpful to go back and see what they represent in
JBrowse.
Let’s check the feature/gene names in the original count matrix loaded:
rownames(xenopus.data) %>% head()
## [1] "gene25011|Xelaev18004747m" "gene21250|Xetrov90028798m.L"
## [3] "gene27977|Xelaev18004749m" "gene26149|Xelaev18004750m"
## [5] "gene25611|Xelaev18004751m" "gene30800|Xelaev18004752m"
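As you can see, the gene names generated by cellranger have
a format that contains “|”. How many are there?
# "|" means "or" in a regular expression, so fixed = TRUE is needed to match it literally
grep("|", rownames(xenopus.data), value = T, fixed = TRUE) %>% length()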
## [1] 41560
nrow(xenopus.data)
## [1] 41560
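A caveat when searching for “|” with grep: in a regular expression, | means alternation ("or"), so a bare "|" pattern matches every string, whether or not it contains the character. To count the names that really contain a pipe, match it literally. A toy illustration with two names taken from the outputs in this document:

```r
# Only the first name contains a pipe character
x <- c("gene134|hes5_X2.L", "trnar-acg_1")

# As a regular expression, "|" matches the empty string, i.e. everything
as_regex <- grepl("|", x)
print(as_regex)

# Two ways to match a literal pipe
as_literal <- grepl("|", x, fixed = TRUE)
as_escaped <- grepl("\\|", x)
print(as_literal)
print(as_escaped)
```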
So all of the genes are named in this format. When this count matrix
was imported into a Seurat object, the CreateSeuratObject function
therefore raised a warning and did the following:
grep("hes5", rownames(xenopus), value = T)
## [1] "gene134-hes5-X2.L"
Two things happened here. First, in
gene134|hes5_X2.L the characters “_” and “|” were both
replaced with “-”. Second, only one of the seven hes5 features is left.
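The renaming itself is easy to mimic in base R. This is a sketch of the effect described by the two warnings, not Seurat's own code:

```r
# Two of the original feature names from the count matrix
original <- c("gene134|hes5_X2.L", "gene19724|hes5.2.L")

# Inside [ ], both "_" and "|" are treated as literal characters
sanitized <- gsub("[_|]", "-", original)
print(sanitized)
```

Knowing the renamed form in advance helps when you later search the Seurat object for a gene by the name it had in the count matrix.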
Quiz: Where are the other 6 hes5 genes that were found
in the original sparse count matrix?
You can answer this here (by changing the Markdown file).
We can also check the changes of cell numbers here during the import:
ncol(xenopus.data)
## [1] 5263
ncol(xenopus)
## [1] 5085
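The drop in cell number is presumably the effect of the min.features filter. To make the two thresholds concrete, here is a toy sketch in base R with made-up counts and small thresholds. It illustrates the idea only; the order in which the two filters are applied is my assumption, not something taken from the Seurat source:

```r
# Toy count matrix: 4 genes (rows) x 5 cells (columns), made-up values
counts <- matrix(
  c(1, 0, 2, 0, 0,
    3, 1, 1, 0, 0,
    0, 0, 0, 0, 4,
    2, 2, 0, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:4]), paste0("cell", 1:5))
)

min_features <- 2  # a cell needs at least 2 detected genes
min_cells <- 2     # a gene needs to be detected in at least 2 cells

# Step 1: drop cells with too few detected genes
genes_per_cell <- colSums(counts > 0)
kept_cells <- counts[, genes_per_cell >= min_features, drop = FALSE]

# Step 2: drop genes detected in too few of the remaining cells
cells_per_gene <- rowSums(kept_cells > 0)
filtered <- kept_cells[cells_per_gene >= min_cells, , drop = FALSE]

print(dim(counts))
print(dim(filtered))
print(filtered)
```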
For most standard scRNA-seq workflows, you are interested in categories of genes that belong together. One important category that is almost always presented in tutorials is the mitochondrial genes. They are special in that their RNA comes from a different subcellular compartment. Dying cells tend to have a higher fraction of mitochondrial transcripts relative to nuclear ones. Do you have mitochondrial genes in this count matrix?
Let’s guess (which many tutorials do) whether a usual name is present in the gene list:
grep("cytb", rownames(xenopus.data), value = T)
## character(0)
grep("CYTB", rownames(xenopus.data), value = T)
## character(0)
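Both searches above are case-sensitive, so a mixed-case spelling would slip through. A sketch with made-up feature names (they are not from this dataset and say nothing about what it contains) shows how ignore.case = TRUE covers all spellings in one go:

```r
# Hypothetical feature names, for illustration only
features <- c("gene1|Cytb.L", "gene2|hes5.1.L", "gene3|CYTB.S")

# Case-sensitive search: the lower-case pattern finds nothing here
hits_exact <- grep("cytb", features, value = TRUE)
print(hits_exact)

# Case-insensitive search finds both spellings
hits_any_case <- grep("cytb", features, value = TRUE, ignore.case = TRUE)
print(hits_any_case)
```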
The best way is to go back to the reference annotation that was used to build the STAR index with which cellranger generated the count matrix.
Quiz: Can you identify the meta information that you can retrieve from GEO to check whether the mitochondrial chromosome is present, and identify the mitochondrial gene names?
Below are some ways to quickly extract the chromosome names as well as the gene names from a GTF file (here the GTF file is from our most up-to-date Xenla10.1):
# Example code in console. You could potentially use chunk header bash instead of r to run this also in the document.
# For linux
# zcat XENLA_ncbi101.XB2023_04.gtf.gz | gawk '{ print $1 }' | uniq
# For Mac OSX
# gzcat XENLA_ncbi101.XB2023_04.gtf.gz | gawk '{ print $1 }' | uniq
# gzcat XENLA_ncbi101.XB2023_04.gtf.gz | gawk '( $1 == "chrM" )'
gzcat XENLA_ncbi101.XB2023_04.gtf.gz | gawk '{ print $1 }' | uniq
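The same two questions (which chromosomes are present, and which genes sit on chrM) can also be answered from within R. The sketch below writes a tiny made-up GTF to a temporary file so that it runs anywhere; the gene names and coordinates are invented, and only the chrM label is taken from the commands above:

```r
# Write a tiny, made-up GTF to a temporary file (all names are hypothetical)
gtf_lines <- c(
  "#!genome-build toy",
  paste("chr1L", "toy", "gene", "100", "900", ".", "+", ".",
        'gene_id "g1"; gene_name "geneX.L";', sep = "\t"),
  paste("chr1L", "toy", "gene", "1500", "2500", ".", "-", ".",
        'gene_id "g2"; gene_name "geneY.L";', sep = "\t"),
  paste("chrM", "toy", "gene", "10", "500", ".", "+", ".",
        'gene_id "g3"; gene_name "geneM";', sep = "\t")
)
gtf_file <- tempfile(fileext = ".gtf")
writeLines(gtf_lines, gtf_file)

# Read the 9 tab-separated columns; quote = "" keeps the quoted attributes intact
gtf <- read.delim(gtf_file, header = FALSE, comment.char = "#",
                  quote = "", stringsAsFactors = FALSE)

# Counterpart of: gawk '{ print $1 }' | uniq
chromosomes <- unique(gtf$V1)
print(chromosomes)

# Counterpart of: gawk '( $1 == "chrM" )', then pull gene_name out of column 9
mito <- gtf[gtf$V1 == "chrM", ]
mito_genes <- sub('.*gene_name "([^"]+)".*', "\\1", mito$V9)
print(mito_genes)
```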
Do your own work
grep("44447", rownames(xenopus.data), value = T)
## [1] "gene44447|LOC108708778"
As I presented yesterday, I have retrieved the archived SRA.lite
files and generated FASTQ files to run cellranger (6.0.1)
again to generate count matrices with the reference that contains
mitochondrial genes.
Because the version of cellranger used to
generate these count matrices differs from the one used for the original ones (Quiz:
what version is it?), the directory structure for loading is slightly
different. Let’s redo all the steps we did in preparation for the
standard workflow.
One side comment:
It is generally not recommended to reuse (overwrite) the same variable names, for the sake of reproducibility. One practice is to clean up this RMarkdown file into a final version that does not make the detour of loading the original count matrix; another is to set more explicit rules for tracking which variable names belong to which dataset. Here, to be explicit, we will remove the previous variables and overwrite them.
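The two directory layouts can be told apart by the file names they contain: older cellranger releases write uncompressed barcodes.tsv, genes.tsv and matrix.mtx, while newer ones write gzipped barcodes.tsv.gz, features.tsv.gz and matrix.mtx.gz. The helper below is a sketch (detect_10x_layout is a name I made up), tried out on two empty mock directories rather than on real data:

```r
# Guess the cellranger output layout from the file names found in `dir`
detect_10x_layout <- function(dir) {
  files <- list.files(dir)
  if (all(c("barcodes.tsv.gz", "features.tsv.gz", "matrix.mtx.gz") %in% files)) {
    "feature_bc_matrix"
  } else if (all(c("barcodes.tsv", "genes.tsv", "matrix.mtx") %in% files)) {
    "gene_bc_matrices"
  } else {
    "unknown"
  }
}

# Mock directories with empty placeholder files (for illustration only)
old_dir <- file.path(tempdir(), "mock_old_layout")
new_dir <- file.path(tempdir(), "mock_new_layout")
dir.create(old_dir, showWarnings = FALSE)
dir.create(new_dir, showWarnings = FALSE)
file.create(file.path(old_dir, c("barcodes.tsv", "genes.tsv", "matrix.mtx")))
file.create(file.path(new_dir, c("barcodes.tsv.gz", "features.tsv.gz", "matrix.mtx.gz")))

old_layout <- detect_10x_layout(old_dir)
new_layout <- detect_10x_layout(new_dir)
print(old_layout)
print(new_layout)
```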
rm(xenopus.data)
rm(xenopus)
tictoc::tic()
xenopus.data <- Read10X(data.dir = "./scCapSt27_xen10_1_20230408/outs/filtered_feature_bc_matrix/")
tictoc::toc()
## 1.432 sec elapsed
xenopus <- CreateSeuratObject(
counts = xenopus.data, # Here you put your count matrix
project = "XenopusBioInfo2023", # This is just a handy name attached to the object
min.cells = 3, # Keep only genes that are detected in at least 3 cells
min.features = 200 # Keep only cells that have at least 200 genes detected
)
## Warning: Feature names cannot have underscores ('_'), replacing with dashes
## ('-')
So again, it is worthwhile to dig in a bit:
# grep is a common UNIX command that also exists as an R function
grep("_", rownames(xenopus.data), value = T)
## [1] "trnar-acg_1" "trnar-acg_2" "trnav-cac_1"
## [4] "trnae-cuc_1" "trnav-aac_1" "trnar-ccu_1"
## [7] "trnar-acg_3" "trnar-acg_4" "trnae-cuc_2"
## [10] "trnav-aac_2" "trnar-acg_5" "trnav-aac_3"
## [13] "trnar-acg_6" "trnav-cac_2" "trnav-aac_4"
## [16] "trnav-cac_3" "trnae-cuc_3" "trnah-gug_1"
## [19] "trnav-aac_5" "trnae-cuc_4" "trnav-cac_4"
## [22] "trnar-acg_7" "trnah-gug_2" "trnav-aac_6"
## [25] "trnae-cuc_5" "trnav-cac_5" "trnar-acg_8"
## [28] "trnar-ccu_2" "trnah-gug_3" "trnav-aac_7"
## [31] "trnae-cuc_6" "trnar-acg_9" "trnar-ccu_3"
## [34] "trnav-aac_8" "trnae-cuc_7" "trnav-cac_6"
## [37] "trnar-acg_10" "trnar-ccu_4" "trnah-gug_4"
## [40] "trnav-aac_9" "trnae-cuc_8" "trnav-cac_7"
## [43] "trnar-acg_11" "trnar-ccu_5" "trnah-gug_5"
## [46] "trnav-cac_8" "trnar-ccu_6" "trnah-gug_6"
## [49] "trnav-aac_10" "trnae-cuc_9" "trnav-cac_9"
## [52] "trnah-aug_1" "trnar-ccu_7" "trnah-gug_7"
## [55] "trnav-aac_11" "trnae-cuc_10" "trnav-cac_10"
## [58] "trnar-acg_12" "trnar-ccu_8" "trnav-aac_12"
## [61] "trnae-cuc_11" "trnar-acg_13" "trnar-ccu_9"
## [64] "trnav-cac_11" "trnar-acg_14" "trnah-gug_8"
## [67] "trnav-aac_13" "trnae-cuc_12" "trnav-cac_12"
## [70] "trnar-acg_15" "trnar-ccu_10" "trnah-gug_9"
## [73] "trnav-aac_14" "trnae-cuc_13" "trnav-cac_13"
## [76] "trnar-acg_16" "trnar-ccu_11" "trnah-gug_10"
## [79] "trnav-aac_15" "trnae-cuc_14" "trnav-cac_14"
## [82] "trnar-acg_17" "trnar-ccu_12" "trnah-gug_11"
## [85] "trnav-aac_16" "trnae-cuc_15" "trnav-cac_15"
## [88] "trnar-acg_18" "trnar-ccu_13" "trnae-cuc_16"
## [91] "trnav-cac_16" "trnar-acg_19" "trnar-ccu_14"
## [94] "trnah-gug_12" "trnav-aac_17" "trnae-cuc_17"
## [97] "trnav-cac_17" "trnar-acg_20" "trnar-ccu_15"
## [100] "trnah-gug_13" "trnav-aac_18" "trnae-cuc_18"
## [103] "trnav-cac_18" "trnar-acg_21" "trnar-ccu_16"
## [106] "trnah-gug_14" "trnav-aac_19" "trnae-cuc_19"
## [109] "trnav-cac_19" "trnar-acg_22" "trnar-ccu_17"
## [112] "trnav-aac_20" "trnar-acg_23" "trnar-ccu_18"
## [115] "trnah-gug_15" "trnav-aac_21" "trnae-cuc_20"
## [118] "trnav-cac_20" "trnar-acg_24" "trnar-ccu_19"
## [121] "trnah-gug_16" "trnav-aac_22" "trnae-cuc_21"
## [124] "trnav-cac_21" "trnar-acg_25" "trnar-ccu_20"
## [127] "trnah-gug_17" "trnav-aac_23" "trnae-cuc_22"
## [130] "trnav-cac_22" "trnar-acg_26" "trnar-ccu_21"
## [133] "trnah-gug_18" "trnav-aac_24" "trnae-cuc_23"
## [136] "trnav-cac_23" "trnar-acg_27" "trnar-ccu_22"
## [139] "trnah-gug_19" "trnav-aac_25" "trnae-cuc_24"
## [142] "trnav-cac_24" "trnar-acg_28" "trnar-ccu_23"
## [145] "trnah-gug_20" "trnav-aac_26" "trnae-cuc_25"
## [148] "trnav-cac_25" "trnar-acg_29" "trnar-ccu_24"
## [151] "trnah-gug_21" "trnav-aac_27" "trnae-cuc_26"
## [154] "trnav-cac_26" "trnar-acg_30" "trnar-ccu_25"
## [157] "trnah-gug_22" "trnav-aac_28" "trnae-cuc_27"
## [160] "trnav-cac_27" "trnar-acg_31" "trnar-ccu_26"
## [163] "trnah-gug_23" "trnav-cac_28" "trnar-acg_32"
## [166] "trnar-ccu_27" "trnah-gug_24" "trnav-aac_29"
## [169] "trnav-cac_29" "trnav-aac_30" "trnav-cac_30"
## [172] "trnar-acg_33" "trnav-cac_31" "trnae-cuc_28"
## [175] "trnae-cuc_29" "trnav-aac_31" "trnav-aac_32"
## [178] "trnav-cac_32" "trnar-ccu_28" "trnah-gug_25"
## [181] "trnav-aac_33" "trnae-cuc_30" "trnav-cac_33"
## [184] "trnar-acg_34" "trnah-gug_26" "trnav-aac_34"
## [187] "trnae-cuc_31" "trnav-cac_34" "trnar-acg_35"
## [190] "trnar-ccu_29" "trnav-aac_35" "trnae-cuc_32"
## [193] "trnav-cac_35" "trnar-acg_36" "trnar-ccu_30"
## [196] "trnah-gug_27" "trnav-aac_36" "trnae-cuc_33"
## [199] "trnav-cac_36" "trnar-acg_37" "trnah-gug_28"
## [202] "trnav-aac_37" "trnah-gug_29" "trnav-aac_38"
## [205] "trnae-cuc_34" "trnav-cac_37" "trnar-acg_38"
## [208] "trnah-gug_30" "trnav-aac_39" "trnae-cuc_35"
## [211] "trnav-cac_38" "trnar-acg_39" "trnar-ccu_31"
## [214] "trnah-gug_31" "trnav-aac_40" "trnae-cuc_36"
## [217] "trnav-cac_39" "trnar-acg_40" "trnar-ccu_32"
## [220] "trnah-gug_32" "trnav-aac_41" "trnae-cuc_37"
## [223] "trnav-cac_40" "trnar-acg_41" "trnar-ccu_33"
## [226] "trnar-acg_42" "trnar-ccu_34" "trnah-gug_33"
## [229] "trnav-aac_42" "trnah-aug_2" "trnar-ccu_35"
## [232] "trnah-gug_34" "trnav-aac_43" "trnae-cuc_38"
## [235] "trnav-cac_41" "trnar-acg_43" "trnar-ccu_36"
## [238] "trnah-gug_35" "trnav-aac_44" "trnae-cuc_39"
## [241] "trnav-cac_42" "trnar-acg_44" "trnar-ccu_37"
## [244] "trnah-gug_36" "trnav-aac_45" "trnae-cuc_40"
## [247] "trnav-cac_43" "trnar-acg_45" "trnar-ccu_38"
## [250] "trnak-cuu_1" "trnae-cuc_41" "trnah-gug_37"
## [253] "trnay-gua_1" "trnar-acg_46" "trnap-ugg_1"
## [256] "trnak-cuu_2" "trnaa-ugc_1" "trnad-guc_1"
## [259] "trnay-gua_2" "trnag-ucc_1" "trnae-cuc_42"
## [262] "trnay-gua_3" "trnar-acg_47" "trnat-ugu_1"
## [265] "trnap-agg_1" "trnap-ugg_2" "trnaa-ugc_2"
## [268] "trnad-guc_2" "trnaf-gaa_1" "trnad-guc_3"
## [271] "trnap-agg_2" "trnap-agg_3" "trnad-guc_4"
## [274] "trnap-agg_4" "trnap-agg_5" "trnap-agg_6"
## [277] "trnaw-cca_1" "trnad-guc_5" "trnad-guc_6"
## [280] "trnai-aau_1" "trnai-aau_2" "trnad-guc_7"
## [283] "trnai-aau_3" "trnad-guc_8" "trnai-aau_4"
## [286] "trnad-guc_9" "trnai-aau_5" "trnai-aau_6"
## [289] "trnad-guc_10" "trnaw-cca_2" "trnaw-cca_3"
## [292] "trnaw-cca_4" "trnaw-cca_5" "trnaw-cca_6"
## [295] "trnaw-cca_7" "trnaw-cca_8" "trnaw-cca_9"
## [298] "trnaw-cca_10" "trnaw-cca_11" "trnaw-cca_12"
## [301] "trnaw-cca_13" "trnag-ccc_1" "trnag-ccc_2"
## [304] "trnag-ccc_3" "trnag-ccc_4" "trnag-ccc_5"
## [307] "trnag-ccc_6" "trnag-ccc_7" "trnag-ccc_8"
## [310] "trnag-ccc_9" "trnag-ccc_10" "trnag-ccc_11"
## [313] "trnag-ccc_12" "trnag-ccc_13" "trnag-ccc_14"
## [316] "trnag-ccc_15" "trnag-ccc_16" "trnag-ccc_17"
## [319] "trnag-ccc_18" "trnag-ccc_19" "trnag-ccc_20"
## [322] "trnag-ccc_21" "trnae-uuc_1" "trnav-aac_46"
## [325] "trnap-cgg_1" "trnap-agg_7" "trnap-agg_8"
## [328] "trnav-uac_1" "trnap-agg_9" "trnap-agg_10"
## [331] "trnap-agg_11" "trnal-uaa_1" "trnal-uaa_2"
## [334] "trnal-uaa_3" "trnal-uaa_4" "trnal-uaa_5"
## [337] "trnal-uaa_6" "trnal-uaa_7" "trnal-uaa_8"
## [340] "trnal-uaa_9" "trnal-uaa_10" "trnal-uaa_11"
## [343] "trnal-uaa_12" "trnal-uaa_13" "trnak-cuu_3"
## [346] "trnaw-cca_14" "trnaw-cca_15" "trnak-cuu_4"
## [349] "trnak-cuu_5" "trnag-gcc_1" "trnak-cuu_6"
## [352] "trnag-gcc_2" "trnak-cuu_7" "trnag-gcc_3"
## [355] "trnak-cuu_8" "trnag-gcc_4" "trnak-cuu_9"
## [358] "trnag-gcc_5" "trnak-cuu_10" "trnag-gcc_6"
## [361] "trnak-cuu_11" "trnak-cuu_12" "trnag-gcc_7"
## [364] "trnak-cuu_13" "trnag-gcc_8" "trnak-cuu_14"
## [367] "trnag-gcc_9" "trnag-gcc_10" "trnag-gcc_11"
## [370] "trnak-cuu_15" "trnad-guc_11" "trnac-gca_1"
## [373] "trnae-uuc_2" "trnas-cga_1" "trnae-uuc_3"
## [376] "trnan-guu_1" "trnav-aac_47" "trnav-aac_48"
## [379] "trnaq-cug_1" "trnas-aga_1" "trnas-aga_2"
## [382] "trnas-uga_1" "trnas-aga_3" "trnas-uga_2"
## [385] "trnaq-cug_2" "trnas-aga_4" "trnas-uga_3"
## [388] "trnas-uga_4" "trnak-cuu_16" "trnag-gcc_12"
## [391] "trnak-cuu_17" "trnag-gcc_13" "trnak-cuu_18"
## [394] "trnag-gcc_14" "trnak-cuu_19" "trnag-gcc_15"
## [397] "trnak-cuu_20" "trnag-gcc_16" "trnak-cuu_21"
## [400] "trnag-gcc_17" "trnak-cuu_22" "trnag-gcc_18"
## [403] "trnak-cuu_23" "trnag-gcc_19" "trnak-cuu_24"
## [406] "trnag-gcc_20" "trnak-cuu_25" "trnag-gcc_21"
## [409] "trnak-cuu_26" "trnag-gcc_22" "trnak-cuu_27"
## [412] "trnag-gcc_23" "trnak-cuu_28" "trnag-gcc_24"
## [415] "trnak-cuu_29" "trnag-gcc_25" "trnak-cuu_30"
## [418] "trnag-gcc_26" "trnak-cuu_31" "trnag-gcc_27"
## [421] "trnak-cuu_32" "trnag-gcc_28" "trnak-cuu_33"
## [424] "trnag-gcc_29" "trnak-cuu_34" "trnag-gcc_30"
## [427] "trnak-cuu_35" "trnag-gcc_31" "trnak-cuu_36"
## [430] "trnag-gcc_32" "trnak-cuu_37" "trnag-gcc_33"
## [433] "trnak-cuu_38" "trnag-gcc_34" "trnak-cuu_39"
## [436] "trnag-gcc_35" "trnak-cuu_40" "trnag-gcc_36"
## [439] "trnak-cuu_41" "trnak-cuu_42" "trnag-gcc_37"
## [442] "trnak-cuu_43" "trnag-gcc_38" "trnak-cuu_44"
## [445] "trnag-gcc_39" "trnak-cuu_45" "trnag-gcc_40"
## [448] "trnak-cuu_46" "trnag-gcc_41" "trnak-cuu_47"
## [451] "trnag-gcc_42" "trnas-uga_5" "trnal-uaa_14"
## [454] "trnag-ccc_22" "trnag-ccc_23" "trnag-ccc_24"
## [457] "trnag-ccc_25" "trnag-ccc_26" "trnag-ccc_27"
## [460] "trnag-ccc_28" "trnag-ccc_29" "trnag-ccc_30"
## [463] "trnat-ugu_2" "trnad-guc_12" "trnap-ugg_3"
## [466] "trnap-agg_12" "trnah-gug_38" "trnah-gug_39"
## [469] "trnak-uuu_1" "trnai-aau_7" "trnad-guc_13"
## [472] "trnad-guc_14" "trnad-guc_15" "trnad-guc_16"
## [475] "trnad-guc_17" "trnad-guc_18" "trnad-guc_19"
## [478] "trnad-guc_20" "trnad-guc_21" "trnad-guc_22"
## [481] "trnad-guc_23" "trnad-guc_24" "trnad-guc_25"
## [484] "trnad-guc_26" "trnad-guc_27" "trnad-guc_28"
## [487] "trnad-guc_29" "trnad-guc_30" "trnad-guc_31"
## [490] "trnad-guc_32" "trnad-guc_33" "trnad-guc_34"
## [493] "trnad-guc_35" "trnad-guc_36" "trnad-guc_37"
## [496] "trnad-guc_38" "trnad-guc_39" "trnad-guc_40"
## [499] "trnad-guc_41" "trnad-guc_42" "trnad-guc_43"
## [502] "trnad-guc_44" "trnad-guc_45" "trnad-guc_46"
## [505] "trnad-guc_47" "trnad-guc_48" "trnad-guc_49"
## [508] "trnad-guc_50" "trnad-guc_51" "trnad-guc_52"
## [511] "trnad-guc_53" "trnad-guc_54" "trnar-acg_48"
## [514] "trnav-cac_44" "trnaq-cug_3" "trnak-cuu_48"
## [517] "trnae-uuc_4" "trnag-ccc_31" "trnap-ugg_4"
## [520] "trnap-ugg_5" "trnaq-uug_1" "trnaq-uug_2"
## [523] "trnas-aga_5" "trnaq-uug_3" "trnas-aga_6"
## [526] "trnaq-uug_4" "trnas-aga_7" "trnaq-uug_5"
## [529] "trnaq-uug_6" "trnaq-uug_7" "trnaq-cug_4"
## [532] "trnaq-uug_8" "trnas-aga_8" "trnaq-cug_5"
## [535] "trnas-aga_9" "trnaq-cug_6" "trnaq-uug_9"
## [538] "trnas-aga_10" "trnaq-cug_7" "trnaq-uug_10"
## [541] "trnas-aga_11" "trnaq-cug_8" "trnaq-uug_11"
## [544] "trnas-aga_12" "trnaq-cug_9" "trnaq-uug_12"
## [547] "trnas-aga_13" "trnaq-cug_10" "trnaq-uug_13"
## [550] "trnas-aga_14" "trnaq-cug_11" "trnaq-uug_14"
## [553] "trnas-aga_15" "trnaq-cug_12" "trnaq-uug_15"
## [556] "trnas-aga_16" "trnaq-cug_13" "trnaq-uug_16"
## [559] "trnas-aga_17" "trnaq-cug_14" "trnaq-uug_17"
## [562] "trnas-aga_18" "trnaq-cug_15" "trnaq-uug_18"
## [565] "trnas-aga_19" "trnaq-cug_16" "trnaq-uug_19"
## [568] "trnas-aga_20" "trnaq-cug_17" "trnaq-uug_20"
## [571] "trnas-aga_21" "trnaq-uug_21" "trnae-uuc_5"
## [574] "trnag-ucc_2" "trnak-uuu_2" "trnam-cau_1"
## [577] "trnam-cau_2" "trnav-aac_49" "trnam-cau_3"
## [580] "trnag-ucc_3" "trnak-cuu_49" "trnam-cau_4"
## [583] "trnav-aac_50" "trnam-cau_5" "trnav-aac_51"
## [586] "trnae-uuc_6" "trnag-ucc_4" "trnav-aac_52"
## [589] "trnag-ucc_5" "trnam-cau_6" "trnav-aac_53"
## [592] "trnav-cac_45" "trnak-cuu_50" "trnav-cac_46"
## [595] "trnak-cuu_51" "trnav-cac_47" "trnae-uuc_7"
## [598] "trnak-uuu_3" "trnam-cau_7" "trnak-uuu_4"
## [601] "trnav-cac_48" "trnan-guu_2" "trnam-cau_8"
## [604] "trnak-uuu_5" "trnan-guu_3" "trnan-guu_4"
## [607] "trnav-cac_49" "trnam-cau_9" "trnak-uuu_6"
## [610] "trnan-guu_5" "trnav-cac_50" "trnav-cac_51"
## [613] "trnam-cau_10" "trnan-guu_6" "trnae-cuc_43"
## [616] "trnam-cau_11" "trnan-guu_7" "trnav-cac_52"
## [619] "trnae-cuc_44" "trnae-cuc_45" "trnam-cau_12"
## [622] "trnae-cuc_46" "trnan-guu_8" "trnav-cac_53"
## [625] "trnak-cuu_52" "trnah-gug_40" "trnav-aac_54"
## [628] "trnak-uuu_7" "trnah-gug_41" "trnah-gug_42"
## [631] "trnak-uuu_8" "trnah-gug_43" "trnak-uuu_9"
## [634] "trnak-uuu_10" "trnak-cuu_53" "trnak-uuu_11"
## [637] "trnar-acg_49" "trnar-acg_50" "trnar-acg_51"
## [640] "trnar-acg_52" "trnar-acg_53" "trnar-acg_54"
## [643] "trnar-acg_55" "trnah-gug_44" "trnae-cuc_47"
## [646] "trnar-ccu_39" "trnah-gug_45" "trnaa-agc_1"
## [649] "trnae-uuc_8" "trnag-ucc_6" "trnav-cac_54"
## [652] "trnag-gcc_43" "trnak-uuu_12" "trnav-aac_55"
## [655] "trnak-cuu_54" "trnak-uuu_13" "trnag-ucc_7"
## [658] "trnav-aac_56" "trnak-cuu_55" "trnag-gcc_44"
## [661] "trnag-ucc_8" "trnak-uuu_14" "trnar-acg_56"
## [664] "trnar-acg_57" "trnav-aac_57" "trnak-cuu_56"
## [667] "trnag-gcc_45" "trnag-ucc_9" "trnav-aac_58"
## [670] "trnak-cuu_57" "trnak-uuu_15" "trnar-acg_58"
## [673] "trnar-acg_59" "trnar-acg_60" "trnar-acg_61"
## [676] "trnak-uuu_16" "trnak-cuu_58" "trnav-aac_59"
## [679] "trnag-ucc_10" "trnar-ccu_40" "trnak-uuu_17"
## [682] "trnak-cuu_59" "trnav-aac_60" "trnag-ucc_11"
## [685] "trnah-gug_46" "trnav-cac_55" "trnae-cuc_48"
## [688] "trnar-ccu_41" "trnak-uuu_18" "trnag-gcc_46"
## [691] "trnak-cuu_60" "trnav-aac_61" "trnar-ccu_42"
## [694] "trnak-uuu_19" "trnag-ucc_12" "trnak-cuu_61"
## [697] "trnav-cac_56" "trnar-ccu_43" "trnak-uuu_20"
## [700] "trnag-ucc_13" "trnah-gug_47" "trnak-cuu_62"
## [703] "trnav-cac_57" "trnae-cuc_49" "trnar-ccu_44"
## [706] "trnar-ccu_45" "trnav-aac_62" "trnah-gug_48"
## [709] "trnag-gcc_47" "trnag-ucc_14" "trnag-ucc_15"
## [712] "trnag-gcc_48" "trnav-aac_63" "trnae-cuc_50"
## [715] "trnak-uuu_21" "trnah-gug_49" "trnag-gcc_49"
## [718] "trnag-gcc_50" "trnae-cuc_51" "trnah-gug_50"
## [721] "trnar-ccu_46" "trnar-ccu_47" "trnar-ccu_48"
## [724] "trnal-uag_1" "trnap-agg_13" "trnan-guu_9"
## [727] "trnak-cuu_63" "trnal-uag_2" "trnap-agg_14"
## [730] "trnak-cuu_64" "trnal-aag_1" "trnap-agg_15"
## [733] "trnak-cuu_65" "trnal-aag_2" "trnak-cuu_66"
## [736] "trnal-aag_3" "trnap-ugg_6" "trnaf-gaa_2"
## [739] "trnam-cau_13" "trnai-aau_8" "trnai-aau_9"
## [742] "trnai-aau_10" "trnai-aau_11" "trnai-aau_12"
## [745] "trnai-aau_13" "trnai-aau_14" "trnas-gcu_1"
## [748] "trnai-aau_15" "trnai-aau_16" "trnal-aag_4"
## [751] "trnap-cgg_2" "trnaw-cca_16" "trnas-gcu_2"
## [754] "trnal-aag_5" "trnap-ugg_7" "trnal-aag_6"
## [757] "trnap-agg_16" "trnal-uag_3" "trnag-gcc_51"
## [760] "trnag-gcc_52" "trnag-gcc_53" "trnag-gcc_54"
## [763] "trnag-gcc_55" "trnag-gcc_56" "trnag-gcc_57"
## [766] "trnal-aag_7" "trnal-aag_8" "trnal-uag_4"
## [769] "trnas-gcu_3" "trnai-aau_17" "trnas-cga_2"
## [772] "trnas-gcu_4" "trnap-ugg_8" "trnap-agg_17"
## [775] "trnat-agu_1" "trnap-ugg_9" "trnat-agu_2"
## [778] "trnap-ugg_10" "trnas-gcu_5" "trnat-cgu_1"
## [781] "trnat-cgu_2" "trnat-agu_3" "trnas-uga_6"
## [784] "trnap-ugg_11" "trnas-gcu_6" "trnat-cgu_3"
## [787] "trnas-uga_7" "trnap-ugg_12" "trnas-gcu_7"
## [790] "trnat-cgu_4" "trnas-uga_8" "trnas-gcu_8"
## [793] "trnat-cgu_5" "trnat-agu_4" "trnas-uga_9"
## [796] "trnas-gcu_9" "trnat-cgu_6" "trnat-agu_5"
## [799] "trnas-uga_10" "trnat-cgu_7" "trnat-agu_6"
## [802] "trnat-agu_7" "trnas-uga_11" "trnat-agu_8"
## [805] "trnas-cga_3" "trnat-cgu_8" "trnat-agu_9"
## [808] "trnat-agu_10" "trnas-cga_4" "trnat-agu_11"
## [811] "trnas-cga_5" "trnat-cgu_9" "trnat-agu_12"
## [814] "trnas-cga_6" "trnat-agu_13" "trnal-aag_9"
## [817] "trnak-cuu_67" "trnat-agu_14" "trnak-cuu_68"
## [820] "trnat-ugu_3" "trnat-ugu_4" "trnak-cuu_69"
## [823] "trnak-cuu_70" "trnat-ugu_5" "trnak-cuu_71"
## [826] "trnak-cuu_72" "trnat-agu_15" "trnak-cuu_73"
## [829] "trnat-agu_16" "trnak-cuu_74" "trnat-agu_17"
## [832] "trnak-cuu_75" "trnak-cuu_76" "trnat-agu_18"
## [835] "trnak-cuu_77" "trnak-cuu_78" "trnat-agu_19"
## [838] "trnak-cuu_79" "trnas-gcu_10" "trnat-cgu_10"
## [841] "trnas-gcu_11" "trnat-cgu_11" "trnat-agu_20"
## [844] "trnat-ugu_6" "trnat-cgu_12" "trnas-gcu_12"
## [847] "trnat-agu_21" "trnat-cgu_13" "trnas-gcu_13"
## [850] "trnas-gcu_14" "trnat-cgu_14" "trnat-agu_22"
## [853] "trnas-gcu_15" "trnat-agu_23" "trnat-agu_24"
## [856] "trnat-ugu_7" "trnag-gcc_58" "trnat-agu_25"
## [859] "trnat-ugu_8" "trnas-gcu_16" "trnat-agu_26"
## [862] "trnat-agu_27" "trnat-ugu_9" "trnas-gcu_17"
## [865] "trnat-agu_28" "trnat-cgu_15" "trnas-gcu_18"
## [868] "trnat-agu_29" "trnat-cgu_16" "trnat-ugu_10"
## [871] "trnas-gcu_19" "trnat-agu_30" "trnat-cgu_17"
## [874] "trnas-gcu_20" "trnas-gcu_21" "trnat-cgu_18"
## [877] "trnat-cgu_19" "trnas-gcu_22" "trnat-cgu_20"
## [880] "trnat-agu_31" "trnat-cgu_21" "trnat-agu_32"
## [883] "trnat-agu_33" "trnat-cgu_22" "trnas-gcu_23"
## [886] "trnat-ugu_11" "trnas-gcu_24" "trnat-cgu_23"
## [889] "trnas-gcu_25" "trnat-cgu_24" "trnas-gcu_26"
## [892] "trnat-cgu_25" "trnas-gcu_27" "trnat-agu_34"
## [895] "trnat-cgu_26" "trnas-gcu_28" "trnat-cgu_27"
## [898] "trnas-gcu_29" "trnas-gcu_30" "trnat-ugu_12"
## [901] "trnas-gcu_31" "trnat-agu_35" "trnat-ugu_13"
## [904] "trnas-cga_7" "trnat-cgu_28" "trnat-ugu_14"
## [907] "trnat-ugu_15" "trnat-ugu_16" "trnat-agu_36"
## [910] "trnat-ugu_17" "trnat-cgu_29" "trnat-ugu_18"
## [913] "trnat-agu_37" "trnat-ugu_19" "trnat-agu_38"
## [916] "trnat-ugu_20" "trnat-agu_39" "trnat-ugu_21"
## [919] "trnat-agu_40" "trnat-ugu_22" "trnat-agu_41"
## [922] "trnat-ugu_23" "trnat-agu_42" "trnat-ugu_24"
## [925] "trnat-agu_43" "trnat-ugu_25" "trnat-agu_44"
## [928] "trnat-ugu_26" "trnat-agu_45" "trnat-ugu_27"
## [931] "trnat-agu_46" "trnat-ugu_28" "trnat-agu_47"
## [934] "trnat-ugu_29" "trnat-agu_48" "trnat-ugu_30"
## [937] "trnat-agu_49" "trnat-ugu_31" "trnat-agu_50"
## [940] "trnat-ugu_32" "trnat-agu_51" "trnat-ugu_33"
## [943] "trnat-agu_52" "trnat-ugu_34" "trnat-ugu_35"
## [946] "trnat-ugu_36" "trnat-agu_53" "trnat-ugu_37"
## [949] "trnat-agu_54" "trnat-cgu_30" "trnat-ugu_38"
## [952] "trnas-uga_12" "trnap-ugg_13" "trnap-ugg_14"
## [955] "trnag-ccc_32" "trnae-uuc_9" "trnak-cuu_80"
## [958] "trnaq-cug_18" "trnav-cac_58" "trnar-acg_62"
## [961] "trnar-acg_63" "trnai-aau_18" "trnag-gcc_59"
## [964] "trnai-aau_19" "trnad-guc_55" "trnad-guc_56"
## [967] "trnai-aau_20" "trnad-guc_57" "trnad-guc_58"
## [970] "trnad-guc_59" "trnad-guc_60" "trnad-guc_61"
## [973] "trnad-guc_62" "trnad-guc_63" "trnai-aau_21"
## [976] "trnad-guc_64" "trnad-guc_65" "trnai-aau_22"
## [979] "trnad-guc_66" "trnad-guc_67" "trnai-aau_23"
## [982] "trnad-guc_68" "trnad-guc_69" "trnai-aau_24"
## [985] "trnad-guc_70" "trnad-guc_71" "trnai-aau_25"
## [988] "trnad-guc_72" "trnad-guc_73" "trnai-aau_26"
## [991] "trnad-guc_74" "trnad-guc_75" "trnai-aau_27"
## [994] "trnad-guc_76" "trnad-guc_77" "trnai-aau_28"
## [997] "trnad-guc_78" "trnad-guc_79" "trnai-aau_29"
## [1000] "trnad-guc_80" "trnai-aau_30" "trnad-guc_81"
## [1003] "trnai-aau_31" "trnai-aau_32" "trnad-guc_82"
## [1006] "trnai-aau_33" "trnad-guc_83" "trnad-guc_84"
## [1009] "trnai-aau_34" "trnad-guc_85" "trnad-guc_86"
## [1012] "trnai-aau_35" "trnad-guc_87" "trnad-guc_88"
## [1015] "trnai-aau_36" "trnap-agg_18" "trnat-agu_55"
## [1018] "trnap-ugg_15" "trnad-guc_89" "trnat-ugu_39"
## [1021] "trnag-ccc_33" "trnag-ccc_34" "trnag-ccc_35"
## [1024] "trnaa-ugc_3" "trnak-uuu_22" "trnaf-gaa_3"
## [1027] "trnay-gua_4" "trnam-cau_14" "trnan-guu_10"
## [1030] "trnaa-ugc_4" "trnal-cag_1" "trnaf-gaa_4"
## [1033] "trnam-cau_15" "trnaa-ugc_5" "trnal-cag_2"
## [1036] "trnak-uuu_23" "trnaf-gaa_5" "trnay-gua_5"
## [1039] "trnam-cau_16" "trnam-cau_17" "trnaa-ugc_6"
## [1042] "trnak-uuu_24" "trnaf-gaa_6" "trnay-gua_6"
## [1045] "trnam-cau_18" "trnan-guu_11" "trnaa-ugc_7"
## [1048] "trnal-cag_3" "trnak-uuu_25" "trnam-cau_19"
## [1051] "trnam-cau_20" "trnan-guu_12" "trnaa-ugc_8"
## [1054] "trnal-cag_4" "trnak-uuu_26" "trnaf-gaa_7"
## [1057] "trnay-gua_7" "trnam-cau_21" "trnam-cau_22"
## [1060] "trnan-guu_13" "trnaa-ugc_9" "trnal-cag_5"
## [1063] "trnak-uuu_27" "trnaf-gaa_8" "trnay-gua_8"
## [1066] "trnam-cau_23" "trnam-cau_24" "trnan-guu_14"
## [1069] "trnaa-ugc_10" "trnal-cag_6" "trnak-uuu_28"
## [1072] "trnaf-gaa_9" "trnay-gua_9" "trnam-cau_25"
## [1075] "trnam-cau_26" "trnan-guu_15" "trnaa-ugc_11"
## [1078] "trnal-cag_7" "trnak-uuu_29" "trnaf-gaa_10"
## [1081] "trnay-gua_10" "trnam-cau_27" "trnam-cau_28"
## [1084] "trnan-guu_16" "trnaa-ugc_12" "trnal-cag_8"
## [1087] "trnak-uuu_30" "trnaf-gaa_11" "trnay-gua_11"
## [1090] "trnam-cau_29" "trnam-cau_30" "trnan-guu_17"
## [1093] "trnaa-ugc_13" "trnal-cag_9" "trnak-uuu_31"
## [1096] "trnaf-gaa_12" "trnay-gua_12" "trnam-cau_31"
## [1099] "trnam-cau_32" "trnan-guu_18" "trnaa-ugc_14"
## [1102] "trnal-cag_10" "trnak-uuu_32" "trnaf-gaa_13"
## [1105] "trnay-gua_13" "trnam-cau_33" "trnam-cau_34"
## [1108] "trnan-guu_19" "trnaa-ugc_15" "trnal-cag_11"
## [1111] "trnak-uuu_33" "trnaf-gaa_14" "trnay-gua_14"
## [1114] "trnam-cau_35" "trnam-cau_36" "trnan-guu_20"
## [1117] "trnaa-ugc_16" "trnal-cag_12" "trnak-uuu_34"
## [1120] "trnaf-gaa_15" "trnay-gua_15" "trnam-cau_37"
## [1123] "trnam-cau_38" "trnan-guu_21" "trnaa-ugc_17"
## [1126] "trnal-cag_13" "trnak-uuu_35" "trnaf-gaa_16"
## [1129] "trnay-gua_16" "trnam-cau_39" "trnam-cau_40"
## [1132] "trnan-guu_22" "trnaa-ugc_18" "trnal-cag_14"
## [1135] "trnak-uuu_36" "trnaf-gaa_17" "trnay-gua_17"
## [1138] "trnam-cau_41" "trnam-cau_42" "trnan-guu_23"
## [1141] "trnaa-ugc_19" "trnal-cag_15" "trnaf-gaa_18"
## [1144] "trnay-gua_18" "trnam-cau_43" "trnam-cau_44"
## [1147] "trnan-guu_24" "trnaa-ugc_20" "trnal-cag_16"
## [1150] "trnak-uuu_37" "trnaf-gaa_19" "trnay-gua_19"
## [1153] "trnam-cau_45" "trnam-cau_46" "trnak-uuu_38"
## [1156] "trnal-cag_17" "trnak-uuu_39" "trnaf-gaa_20"
## [1159] "trnay-gua_20" "trnam-cau_47" "trnam-cau_48"
## [1162] "trnan-guu_25" "trnaa-ugc_21" "trnal-cag_18"
## [1165] "trnak-uuu_40" "trnaf-gaa_21" "trnay-gua_21"
## [1168] "trnam-cau_49" "trnam-cau_50" "trnan-guu_26"
## [1171] "trnaa-ugc_22" "trnal-cag_19" "trnak-uuu_41"
## [1174] "trnaf-gaa_22" "trnay-gua_22" "trnam-cau_51"
## [1177] "trnam-cau_52" "trnan-guu_27" "trnaa-ugc_23"
## [1180] "trnal-cag_20" "trnak-uuu_42" "trnan-guu_28"
## [1183] "trnaf-gaa_23" "trnay-gua_23" "trnan-guu_29"
## [1186] "trnak-uuu_43" "trnaf-gaa_24" "trnay-gua_24"
## [1189] "trnam-cau_53" "trnal-cag_21" "trnak-uuu_44"
## [1192] "trnaf-gaa_25" "trnam-cau_54" "trnan-guu_30"
## [1195] "trnaa-ugc_24" "trnal-cag_22" "trnak-uuu_45"
## [1198] "trnaf-gaa_26" "trnay-gua_25" "trnas-aga_22"
## [1201] "trnaq-uug_22" "trnas-aga_23" "trnaq-uug_23"
## [1204] "trnas-aga_24" "trnaq-uug_24" "trnas-aga_25"
## [1207] "trnaq-uug_25" "trnas-aga_26" "trnaq-uug_26"
## [1210] "trnas-aga_27" "trnaq-uug_27" "trnas-aga_28"
## [1213] "trnay-gua_26" "trnak-uuu_46" "trnam-cau_55"
## [1216] "trnay-gua_27" "trnag-ucc_16" "trnae-uuc_10"
## [1219] "trnaa-agc_2" "trnaf-gaa_27" "trnak-uuu_47"
## [1222] "trnam-cau_56" "trnay-gua_28" "trnag-ucc_17"
## [1225] "trnae-uuc_11" "trnaa-agc_3" "trnaf-gaa_28"
## [1228] "trnak-uuu_48" "trnam-cau_57" "trnay-gua_29"
## [1231] "trnag-ucc_18" "trnae-uuc_12" "trnaa-agc_4"
## [1234] "trnaf-gaa_29" "trnak-uuu_49" "trnam-cau_58"
## [1237] "trnay-gua_30" "trnag-ucc_19" "trnae-uuc_13"
## [1240] "trnaa-agc_5" "trnaf-gaa_30" "trnak-uuu_50"
## [1243] "trnam-cau_59" "trnay-gua_31" "trnag-ucc_20"
## [1246] "trnae-uuc_14" "trnaa-agc_6" "trnaf-gaa_31"
## [1249] "trnak-uuu_51" "trnam-cau_60" "trnay-gua_32"
## [1252] "trnag-ucc_21" "trnae-uuc_15" "trnaa-agc_7"
## [1255] "trnaf-gaa_32" "trnak-uuu_52" "trnam-cau_61"
## [1258] "trnay-gua_33" "trnag-ucc_22" "trnae-uuc_16"
## [1261] "trnaa-agc_8" "trnaf-gaa_33" "trnak-uuu_53"
## [1264] "trnam-cau_62" "trnay-gua_34" "trnag-ucc_23"
## [1267] "trnae-uuc_17" "trnaa-agc_9" "trnaf-gaa_34"
## [1270] "trnak-uuu_54" "trnam-cau_63" "trnay-gua_35"
## [1273] "trnag-ucc_24" "trnae-uuc_18" "trnaa-agc_10"
## [1276] "trnaf-gaa_35" "trnak-uuu_55" "trnam-cau_64"
## [1279] "trnag-ucc_25" "trnak-uuu_56" "trnam-cau_65"
## [1282] "trnak-uuu_57" "trnaf-gaa_36" "trnae-uuc_19"
## [1285] "trnam-cau_66" "trnak-uuu_58" "trnaf-gaa_37"
## [1288] "trnaa-agc_11" "trnae-uuc_20" "trnag-ucc_26"
## [1291] "trnay-gua_36" "trnam-cau_67" "trnak-uuu_59"
## [1294] "trnaf-gaa_38" "trnaa-agc_12" "trnae-uuc_21"
## [1297] "trnag-ucc_27" "trnay-gua_37" "trnam-cau_68"
## [1300] "trnak-uuu_60" "trnaf-gaa_39" "trnaa-agc_13"
## [1303] "trnae-uuc_22" "trnag-ucc_28" "trnay-gua_38"
## [1306] "trnam-cau_69" "trnak-uuu_61" "trnaf-gaa_40"
## [1309] "trnaa-agc_14" "trnae-uuc_23" "trnag-ucc_29"
## [1312] "trnay-gua_39" "trnam-cau_70" "trnak-uuu_62"
## [1315] "trnaf-gaa_41" "trnaa-agc_15" "trnae-uuc_24"
## [1318] "trnag-ucc_30" "trnay-gua_40" "trnam-cau_71"
## [1321] "trnak-uuu_63" "trnaf-gaa_42" "trnaa-agc_16"
## [1324] "trnae-uuc_25" "trnag-ucc_31" "trnay-gua_41"
## [1327] "trnam-cau_72" "trnak-uuu_64" "trnaq-uug_28"
## [1330] "trnae-cuc_52" "trnag-ucc_32" "trnag-ucc_33"
## [1333] "trnag-ucc_34" "trnag-ucc_35" "trnam-cau_73"
## [1336] "trnak-cuu_81" "trnam-cau_74" "trnak-uuu_65"
## [1339] "trnam-cau_75" "trnak-uuu_66" "trnak-uuu_67"
## [1342] "trnai-uau_1" "trnai-uau_2" "trnak-cuu_82"
## [1345] "trnav-cac_59" "trnan-guu_31" "trnae-cuc_53"
## [1348] "trnam-cau_76" "trnan-guu_32" "trnae-cuc_54"
## [1351] "trnav-cac_60" "trnam-cau_77" "trnai-uau_3"
## [1354] "trnak-uuu_68" "trnah-gug_51" "trnah-gug_52"
## [1357] "trnak-uuu_69" "trnar-acg_64" "trnar-acg_65"
## [1360] "trnar-acg_66" "trnar-acg_67" "trnar-acg_68"
## [1363] "trnar-acg_69" "trnar-acg_70" "trnar-acg_71"
## [1366] "trnar-acg_72" "trnar-acg_73" "trnar-acg_74"
## [1369] "trnak-uuu_70" "trnar-ccu_49" "trnak-cuu_83"
## [1372] "trnae-cuc_55" "trnah-gug_53" "trnak-cuu_84"
## [1375] "trnae-cuc_56" "trnah-gug_54" "trnak-cuu_85"
## [1378] "trnae-cuc_57" "trnah-gug_55" "trnak-cuu_86"
## [1381] "trnae-cuc_58" "trnah-gug_56" "trnak-cuu_87"
## [1384] "trnar-ccu_50" "trnae-cuc_59" "trnak-cuu_88"
## [1387] "trnai-uau_4" "trnap-ugg_16" "trnap-agg_19"
## [1390] "trnai-uau_5" "trnap-ugg_17" "trnai-uau_6"
## [1393] "trnap-ugg_18" "trnap-agg_20" "trnai-uau_7"
## [1396] "trnap-ugg_19" "trnap-agg_21" "trnai-uau_8"
## [1399] "trnap-ugg_20" "trnap-agg_22" "trnai-uau_9"
## [1402] "trnap-ugg_21" "trnai-uau_10" "trnap-ugg_22"
## [1405] "trnap-agg_23" "trnai-uau_11" "trnap-ugg_23"
## [1408] "trnap-agg_24" "trnai-uau_12" "trnap-ugg_24"
## [1411] "trnap-agg_25" "trnai-uau_13" "trnap-ugg_25"
## [1414] "trnap-agg_26" "trnai-uau_14" "trnap-ugg_26"
## [1417] "trnap-agg_27" "trnai-uau_15" "trnap-ugg_27"
## [1420] "trnap-agg_28" "trnai-uau_16" "trnap-ugg_28"
## [1423] "trnap-agg_29" "trnai-uau_17" "trnap-ugg_29"
## [1426] "trnap-agg_30" "trnai-uau_18" "trnap-ugg_30"
## [1429] "trnap-agg_31" "trnai-uau_19" "trnap-ugg_31"
## [1432] "trnap-agg_32" "trnai-uau_20" "trnap-ugg_32"
## [1435] "trnap-agg_33" "trnai-uau_21" "trnap-ugg_33"
## [1438] "trnap-agg_34" "trnai-uau_22" "trnap-ugg_34"
## [1441] "trnap-agg_35" "trnai-uau_23" "trnap-ugg_35"
## [1444] "trnap-agg_36" "trnai-uau_24" "trnai-uau_25"
## [1447] "trnap-ugg_36" "trnai-uau_26" "trnai-uau_27"
## [1450] "trnap-agg_37" "trnai-uau_28" "trnap-ugg_37"
## [1453] "trnay-gua_42" "trnae-uuc_26" "trnay-gua_43"
## [1456] "trnam-cau_78" "trnae-uuc_27" "trnaq-uug_29"
## [1459] "trnay-gua_44" "trnag-ucc_36" "trnaq-uug_30"
## [1462] "trnaq-uug_31" "trnaq-uug_32" "trnaq-uug_33"
## [1465] "trnah-gug_57" "trnar-ccu_51" "trnai-aau_37"
## [1468] "trnai-aau_38" "trnai-aau_39" "trnai-aau_40"
## [1471] "trnai-aau_41" "trnas-gcu_32" "trnai-aau_42"
## [1474] "trnai-aau_43" "trnai-aau_44" "trnas-gcu_33"
## [1477] "trnai-aau_45" "trnai-aau_46" "trnas-gcu_34"
## [1480] "trnai-aau_47" "trnai-aau_48" "trnai-aau_49"
## [1483] "trnai-aau_50" "trnal-aag_10" "trnap-cgg_3"
## [1486] "trnaw-cca_17" "trnas-gcu_35" "trnal-aag_11"
## [1489] "trnap-ugg_38" "trnal-aag_12" "trnal-aag_13"
## [1492] "trnal-aag_14" "trnap-agg_38" "trnal-uag_5"
## [1495] "trnag-gcc_60" "trnar-ucu_1" "trnag-gcc_61"
## [1498] "trnag-gcc_62" "trnag-gcc_63" "trnag-gcc_64"
## [1501] "trnag-gcc_65" "trnag-gcc_66" "trnag-gcc_67"
## [1504] "trnal-aag_15" "trnal-uag_6" "trnal-aag_16"
## [1507] "trnal-uag_7" "trnal-uag_8" "trnal-aag_17"
## [1510] "trnal-uag_9" "trnal-aag_18" "trnas-gcu_36"
## [1513] "trnat-cgu_31" "trnas-gcu_37" "trnat-agu_56"
## [1516] "trnat-agu_57" "trnas-gcu_38" "trnat-agu_58"
## [1519] "trnas-gcu_39" "trnat-agu_59" "trnat-agu_60"
## [1522] "trnas-gcu_40" "trnat-agu_61" "trnat-agu_62"
## [1525] "trnas-gcu_41" "trnat-agu_63" "trnat-agu_64"
## [1528] "trnas-gcu_42" "trnat-cgu_32" "trnat-agu_65"
## [1531] "trnat-agu_66" "trnas-gcu_43" "trnat-cgu_33"
## [1534] "trnas-cga_8" "trnas-gcu_44" "trnas-cga_9"
## [1537] "trnas-gcu_45" "trnat-cgu_34" "trnas-gcu_46"
## [1540] "trnat-cgu_35" "trnas-cga_10" "trnas-gcu_47"
## [1543] "trnas-cga_11" "trnas-cga_12" "trnas-cga_13"
## [1546] "trnas-cga_14" "trnat-agu_67" "trnas-cga_15"
## [1549] "trnas-cga_16" "trnas-aga_29" "trnas-uga_13"
## [1552] "trnat-agu_68" "trnas-cga_17" "trnas-cga_18"
## [1555] "trnas-cga_19" "trnas-cga_20" "trnat-agu_69"
## [1558] "trnas-cga_21" "trnas-cga_22" "trnat-agu_70"
## [1561] "trnas-cga_23" "trnas-cga_24" "trnas-uga_14"
## [1564] "trnas-uga_15" "trnas-cga_25" "trnas-cga_26"
## [1567] "trnat-agu_71" "trnas-cga_27" "trnas-cga_28"
## [1570] "trnat-agu_72" "trnas-uga_16" "trnas-cga_29"
## [1573] "trnat-agu_73" "trnas-cga_30" "trnat-agu_74"
## [1576] "trnas-cga_31" "trnas-cga_32" "trnas-uga_17"
## [1579] "trnat-agu_75" "trnas-cga_33" "trnas-cga_34"
## [1582] "trnas-cga_35" "trnas-cga_36" "trnat-agu_76"
## [1585] "trnas-cga_37" "trnat-agu_77" "trnas-aga_30"
## [1588] "trnas-cga_38" "trnas-cga_39" "trnas-cga_40"
## [1591] "trnas-cga_41" "trnat-agu_78" "trnas-cga_42"
## [1594] "trnas-cga_43" "trnat-agu_79" "trnas-cga_44"
## [1597] "trnas-cga_45" "trnas-cga_46" "trnas-cga_47"
## [1600] "trnas-uga_18" "trnas-cga_48" "trnas-cga_49"
## [1603] "trnat-agu_80" "trnas-cga_50" "trnas-cga_51"
## [1606] "trnal-aag_19" "trnat-ugu_40" "trnat-cgu_36"
## [1609] "trnas-cga_52" "trnat-cgu_37" "trnas-uga_19"
## [1612] "trnav-cac_61" "trnaa-ugc_25" "trnaa-ugc_26"
## [1615] "trnaa-cgc_1" "trnaa-cgc_2" "trnaa-ugc_27"
## [1618] "trnaa-agc_17" "trnat-ugu_41" "trnat-ugu_42"
## [1621] "trnat-ugu_43" "trnat-ugu_44" "trnat-ugu_45"
## [1624] "trnat-ugu_46" "trnat-ugu_47" "trnat-ugu_48"
## [1627] "trnat-ugu_49" "trnat-ugu_50" "trnat-ugu_51"
## [1630] "trnat-ugu_52" "trnat-ugu_53" "trnai-uau_29"
## [1633] "trnat-ugu_54" "trnat-ugu_55" "trnav-uac_2"
## [1636] "trnav-uac_3" "trnav-uac_4" "trnaw-cca_18"
## [1639] "trnaw-cca_19" "trnaw-cca_20" "trnaw-cca_21"
## [1642] "trnaw-cca_22" "trnav-cac_62" "trnat-ugu_56"
## [1645] "trnas-uga_20" "trnat-agu_81" "trnaa-cgc_3"
## [1648] "trnaa-ugc_28" "trnat-agu_82" "trnat-ugu_57"
## [1651] "trnaw-cca_23" "trnaa-ugc_29" "trnaw-cca_24"
## [1654] "trnat-ugu_58" "trnaw-cca_25" "trnaa-ugc_30"
## [1657] "trnaw-cca_26" "trnaa-ugc_31" "trnat-ugu_59"
## [1660] "trnaw-cca_27" "trnaw-cca_28" "trnaa-ugc_32"
## [1663] "trnam-cau_79" "trnaa-ugc_33" "trnaw-cca_29"
## [1666] "trnaq-uug_34" "trnaw-cca_30" "trnaq-uug_35"
## [1669] "trnaa-ugc_34" "trnaw-cca_31" "trnaw-cca_32"
## [1672] "trnaq-uug_36" "trnat-ugu_60" "trnaq-uug_37"
## [1675] "trnaa-cgc_4" "trnaw-cca_33" "trnat-ugu_61"
## [1678] "trnaw-cca_34" "trnat-ugu_62" "trnaw-cca_35"
## [1681] "trnat-ugu_63" "trnaa-ugc_35" "trnaw-cca_36"
## [1684] "trnaw-cca_37" "trnat-ugu_64" "trnaa-ugc_36"
## [1687] "trnat-agu_83" "trnaw-cca_38" "trnaq-uug_38"
## [1690] "trnat-agu_84" "trnaw-cca_39" "trnaq-uug_39"
## [1693] "trnaa-ugc_37" "trnat-agu_85" "trnaw-cca_40"
## [1696] "trnaa-ugc_38" "trnat-agu_86" "trnai-aau_51"
## [1699] "trnaq-cug_19" "trnaq-uug_40" "trnai-aau_52"
## [1702] "trnaq-uug_41" "trnaq-uug_42" "trnai-aau_53"
## [1705] "trnaq-uug_43" "trnaq-uug_44" "trnai-aau_54"
## [1708] "trnaq-uug_45" "trnaq-uug_46" "trnai-aau_55"
## [1711] "trnaq-cug_20" "trnaq-uug_47" "trnai-aau_56"
## [1714] "trnaq-cug_21" "trnaq-uug_48" "trnai-aau_57"
## [1717] "trnaq-cug_22" "trnaq-uug_49" "trnai-aau_58"
## [1720] "trnaq-cug_23" "trnaq-uug_50" "trnai-aau_59"
## [1723] "trnaq-uug_51" "trnaq-uug_52" "trnai-aau_60"
## [1726] "trnaq-cug_24" "trnaq-uug_53" "trnai-aau_61"
## [1729] "trnaq-cug_25" "trnaq-uug_54" "trnai-aau_62"
## [1732] "trnaq-uug_55" "trnai-aau_63" "trnaq-cug_26"
## [1735] "trnaq-uug_56" "trnai-aau_64" "trnaq-uug_57"
## [1738] "trnaq-uug_58" "trnai-aau_65" "trnaq-cug_27"
## [1741] "trnaq-uug_59" "trnai-aau_66" "trnaq-cug_28"
## [1744] "trnaq-uug_60" "trnaq-uug_61" "trnai-aau_67"
## [1747] "trnaq-uug_62" "trnai-aau_68" "trnaa-ugc_39"
## [1750] "trnaa-ugc_40" "trnaa-cgc_5" "trnaa-ugc_41"
## [1753] "trnaa-ugc_42" "trnaa-cgc_6" "trnaa-ugc_43"
## [1756] "trnaa-cgc_7" "trnaa-ugc_44" "trnaa-cgc_8"
## [1759] "trnaa-ugc_45" "trnaa-ugc_46" "trnaa-cgc_9"
## [1762] "trnaa-ugc_47" "trnaa-cgc_10" "trnaa-ugc_48"
## [1765] "trnaa-cgc_11" "trnaa-ugc_49" "trnaa-cgc_12"
## [1768] "trnaa-ugc_50" "trnaa-cgc_13" "trnaa-ugc_51"
## [1771] "trnaa-ugc_52" "trnaa-ugc_53" "trnaa-cgc_14"
## [1774] "trnaa-ugc_54" "trnaa-ugc_55" "trnaa-cgc_15"
## [1777] "trnaa-ugc_56" "trnaa-cgc_16" "trnaa-ugc_57"
## [1780] "trnaa-ugc_58" "trnal-cag_23" "trnal-caa_1"
## [1783] "trnaq-cug_29" "trnal-caa_2" "trnad-guc_90"
## [1786] "trnad-guc_91" "trnai-aau_69" "trnaa-cgc_17"
## [1789] "trnaa-cgc_18" "trnaa-cgc_19" "trnaa-cgc_20"
## [1792] "trnaa-cgc_21" "trnaa-cgc_22" "trnaa-cgc_23"
## [1795] "trnam-cau_80" "trnat-agu_87" "trnag-gcc_68"
## [1798] "trnag-gcc_69" "trnag-gcc_70" "trnag-gcc_71"
## [1801] "trnag-gcc_72" "trnag-gcc_73" "trnar-ucu_2"
## [1804] "trnal-uaa_15" "trnar-ccg_1" "trnaq-cug_30"
## [1807] "trnar-ccg_2" "trnan-guu_33" "trnaf-gaa_43"
## [1810] "trnat-ugu_65" "trnat-ugu_66" "trnat-ugu_67"
## [1813] "trnat-ugu_68" "trnat-ugu_69" "trnat-ugu_70"
## [1816] "trnat-cgu_38" "trnai-aau_70" "trnai-aau_71"
## [1819] "trnai-aau_72" "trnad-guc_92" "trnad-guc_93"
## [1822] "trnal-caa_3" "trnaq-uug_63" "trnal-caa_4"
## [1825] "trnal-caa_5" "trnal-caa_6" "trnaq-cug_31"
## [1828] "trnal-caa_7" "trnal-cag_24" "trnaa-ugc_59"
## [1831] "trnal-caa_8" "trnaa-ugc_60" "trnaa-cgc_24"
## [1834] "trnal-caa_9" "trnal-caa_10" "trnaa-cgc_25"
## [1837] "trnal-caa_11" "trnaa-cgc_26" "trnai-aau_73"
## [1840] "trnal-caa_12" "trnaa-ugc_61" "trnal-caa_13"
## [1843] "trnat-agu_88" "trnaw-cca_41" "trnaa-ugc_62"
## [1846] "trnat-cgu_39" "trnaw-cca_42" "trnat-ugu_71"
## [1849] "trnat-agu_89" "trnaa-ugc_63" "trnaa-cgc_27"
## [1852] "trnat-agu_90" "trnas-uga_21" "trnav-cac_63"
## [1855] "trnaw-cca_43" "trnaw-cca_44" "trnaw-cca_45"
## [1858] "trnaw-cca_46" "trnaw-cca_47" "trnaw-cca_48"
## [1861] "trnav-uac_5" "trnav-uac_6" "trnav-uac_7"
## [1864] "trnav-uac_8" "trnav-uac_9" "trnav-uac_10"
## [1867] "trnat-ugu_72" "trnat-ugu_73" "trnat-ugu_74"
## [1870] "trnat-ugu_75" "trnat-ugu_76" "trnat-ugu_77"
## [1873] "trnaa-agc_18" "trnaa-ugc_64" "trnac-gca_2"
## [1876] "trnam-cau_81" "trnat-agu_91" "trnag-gcc_74"
## [1879] "trnag-gcc_75" "trnag-gcc_76" "trnar-ucu_3"
## [1882] "trnar-ccg_3" "trnar-ccg_4" "trnar-ccg_5"
## [1885] "trnan-guu_34" "trnaf-gaa_44" "trnat-ugu_78"
## [1888] "trnat-ugu_79" "trnac-gca_3" "trnac-gca_4"
## [1891] "trnac-gca_5" "trnac-gca_6" "trnac-gca_7"
## [1894] "trnai-uau_30" "trnas-gcu_48" "trnal-uaa_16"
## [1897] "trnal-uaa_17" "trnal-uaa_18" "trnal-uaa_19"
## [1900] "trnal-uaa_20" "trnal-uaa_21" "trnal-uaa_22"
## [1903] "trnal-uaa_23" "trnal-uaa_24" "trnal-uaa_25"
## [1906] "trnal-uaa_26" "trnal-uaa_27" "trnal-uaa_28"
## [1909] "trnal-uaa_29" "trnal-uaa_30" "trnal-uaa_31"
## [1912] "trnal-uaa_32" "trnal-uaa_33" "trnal-uaa_34"
## [1915] "trnal-uaa_35" "trnal-uaa_36" "trnal-uaa_37"
## [1918] "trnal-uaa_38" "trnal-uaa_39" "trnal-uaa_40"
## [1921] "trnal-uaa_41" "trnal-uaa_42" "trnal-uaa_43"
## [1924] "trnal-uaa_44" "trnal-uaa_45" "trnal-uaa_46"
## [1927] "trnal-uaa_47" "trnal-uaa_48" "trnal-uaa_49"
## [1930] "trnal-uaa_50" "trnal-uaa_51" "trnal-uaa_52"
## [1933] "trnal-uaa_53" "trnal-uaa_54" "trnal-uaa_55"
## [1936] "trnal-uaa_56" "trnal-uaa_57" "trnal-uaa_58"
## [1939] "trnal-uaa_59" "trnal-uaa_60" "trnal-uaa_61"
## [1942] "trnal-uaa_62" "trnae-cuc_60" "trnac-gca_8"
## [1945] "trnas-aga_31" "trnal-cag_25" "trnae-uuc_28"
## [1948] "trnaa-agc_19" "trnay-gua_45" "trnaa-agc_20"
## [1951] "trnay-gua_46" "trnaa-agc_21" "trnae-uuc_29"
## [1954] "trnae-uuc_30" "trnay-gua_47" "trnaa-agc_22"
## [1957] "trnaa-agc_23" "trnaa-agc_24" "trnay-gua_48"
## [1960] "trnae-uuc_31" "trnaa-agc_25" "trnaa-agc_26"
## [1963] "trnae-uuc_32" "trnaa-agc_27" "trnay-gua_49"
## [1966] "trnae-uuc_33" "trnaa-agc_28" "trnay-gua_50"
## [1969] "trnay-gua_51" "trnae-uuc_34" "trnae-uuc_35"
## [1972] "trnay-gua_52" "trnae-uuc_36" "trnae-uuc_37"
## [1975] "trnas-gcu_49" "trnai-uau_31" "trnac-gca_9"
## [1978] "trnac-gca_10" "trnac-gca_11" "trnac-gca_12"
## [1981] "trnac-gca_13" "trnac-gca_14" "trnac-gca_15"
## [1984] "trnal-uaa_63" "trnal-uaa_64" "trnal-uaa_65"
## [1987] "trnal-uaa_66" "trnal-uaa_67" "trnal-uaa_68"
## [1990] "trnal-uaa_69" "trnal-uaa_70" "trnaa-cgc_28"
## [1993] "trnas-aga_32" "trnay-gua_53" "trnay-gua_54"
## [1996] "trnae-uuc_38" "trnay-gua_55" "trnay-gua_56"
## [1999] "trnay-gua_57" "trnae-uuc_39" "trnay-gua_58"
## [2002] "trnaa-agc_29" "trnay-gua_59" "trnay-gua_60"
## [2005] "trnay-gua_61" "trnaa-agc_30" "trnaa-agc_31"
## [2008] "trnaa-agc_32" "trnaa-agc_33" "trnaa-agc_34"
## [2011] "trnay-gua_62" "trnaa-agc_35" "trnay-gua_63"
## [2014] "trnaa-agc_36" "trnaa-agc_37" "trnay-gua_64"
## [2017] "trnay-gua_65" "trnay-gua_66" "trnaa-agc_38"
## [2020] "trnae-uuc_40" "trnay-gua_67" "trnay-gua_68"
## [2023] "trnay-gua_69" "trnae-uuc_41" "trnae-uuc_42"
## [2026] "trnay-gua_70" "trnae-uuc_43" "trnae-uuc_44"
## [2029] "trnan-guu_35" "trnat-agu_92" "trnaa-agc_39"
## [2032] "trnav-uac_11" "trnav-uac_12" "trnav-uac_13"
## [2035] "trnag-ucc_37" "trnav-uac_14" "trnav-uac_15"
## [2038] "trnan-guu_36" "trnak-uuu_71" "trnam-cau_82"
## [2041] "trnaa-agc_40" "trnav-uac_16" "trnav-uac_17"
## [2044] "trnav-uac_18" "trnav-uac_19" "trnav-uac_20"
## [2047] "trnav-uac_21" "trnav-uac_22" "trnav-uac_23"
## [2050] "trnav-uac_24" "trnav-uac_25" "trnav-uac_26"
## [2053] "trnav-uac_27" "trnav-uac_28" "trnav-uac_29"
## [2056] "trnav-uac_30" "trnav-uac_31" "trnav-uac_32"
## [2059] "trnav-uac_33" "trnav-uac_34" "trnav-uac_35"
## [2062] "trnav-uac_36" "trnav-uac_37" "trnav-uac_38"
## [2065] "trnav-uac_39" "trnav-uac_40" "trnav-uac_41"
## [2068] "trnav-uac_42" "trnav-uac_43" "trnav-uac_44"
## [2071] "trnav-uac_45" "trnav-uac_46" "trnav-uac_47"
## [2074] "trnaq-cug_32" "trnaf-gaa_45" "trnaf-gaa_46"
## [2077] "trnal-cag_26" "trnaf-gaa_47" "trnal-cag_27"
## [2080] "trnal-uag_10" "trnaf-gaa_48" "trnal-cag_28"
## [2083] "trnar-ccu_52" "trnal-cag_29" "trnar-ccu_53"
## [2086] "trnar-ccu_54" "trnal-cag_30" "trnal-cag_31"
## [2089] "trnal-cag_32" "trnal-cag_33" "trnaf-gaa_49"
## [2092] "trnal-cag_34" "trnaf-gaa_50" "trnal-cag_35"
## [2095] "trnal-cag_36" "trnaf-gaa_51" "trnal-cag_37"
## [2098] "trnaf-gaa_52" "trnal-cag_38" "trnal-uag_11"
## [2101] "trnas-gcu_50" "trnas-gcu_51" "trnal-aag_20"
## [2104] "trnal-uag_12" "trnap-cgg_4" "trnan-guu_37"
## [2107] "trnal-aag_21" "trnas-gcu_52" "trnal-aag_22"
## [2110] "trnal-uag_13" "trnap-cgg_5" "trnan-guu_38"
## [2113] "trnas-gcu_53" "trnal-aag_23" "trnal-uag_14"
## [2116] "trnap-cgg_6" "trnan-guu_39" "trnal-aag_24"
## [2119] "trnal-uag_15" "trnap-cgg_7" "trnan-guu_40"
## [2122] "trnas-gcu_54" "trnal-aag_25" "trnal-uag_16"
## [2125] "trnap-cgg_8" "trnan-guu_41" "trnap-cgg_9"
## [2128] "trnan-guu_42" "trnal-aag_26" "trnas-gcu_55"
## [2131] "trnal-aag_27" "trnap-cgg_10" "trnan-guu_43"
## [2134] "trnas-gcu_56" "trnal-aag_28" "trnal-uag_17"
## [2137] "trnap-cgg_11" "trnan-guu_44" "trnal-aag_29"
## [2140] "trnal-uag_18" "trnal-uag_19" "trnap-cgg_12"
## [2143] "trnan-guu_45" "trnas-gcu_57" "trnal-aag_30"
## [2146] "trnal-uag_20" "trnap-cgg_13" "trnan-guu_46"
## [2149] "trnas-gcu_58" "trnal-aag_31" "trnal-uag_21"
## [2152] "trnap-cgg_14" "trnan-guu_47" "trnas-gcu_59"
## [2155] "trnal-aag_32" "trnal-uag_22" "trnap-cgg_15"
## [2158] "trnan-guu_48" "trnas-gcu_60" "trnal-aag_33"
## [2161] "trnal-uag_23" "trnap-cgg_16" "trnan-guu_49"
## [2164] "trnas-gcu_61" "trnal-aag_34" "trnal-uag_24"
## [2167] "trnap-cgg_17" "trnan-guu_50" "trnam-cau_83"
## [2170] "trnas-gcu_62" "trnal-aag_35" "trnal-uag_25"
## [2173] "trnap-cgg_18" "trnan-guu_51" "trnam-cau_84"
## [2176] "trnas-gcu_63" "trnal-aag_36" "trnal-uag_26"
## [2179] "trnap-cgg_19" "trnan-guu_52" "trnam-cau_85"
## [2182] "trnas-gcu_64" "trnal-aag_37" "trnal-uag_27"
## [2185] "trnap-cgg_20" "trnan-guu_53" "trnam-cau_86"
## [2188] "trnas-gcu_65" "trnas-gcu_66" "trnal-aag_38"
## [2191] "trnae-cuc_61" "trnav-cac_64" "trnar-acg_75"
## [2194] "trnar-ccu_55" "trnav-aac_64" "trnae-cuc_62"
## [2197] "trnav-cac_65" "trnar-acg_76" "trnar-ccu_56"
## [2200] "trnav-aac_65" "trnae-cuc_63" "trnav-cac_66"
## [2203] "trnar-acg_77" "trnar-ccu_57" "trnav-aac_66"
## [2206] "trnae-cuc_64" "trnav-cac_67" "trnar-acg_78"
## [2209] "trnas-gcu_67" "trnas-gcu_68" "trnas-gcu_69"
## [2212] "trnam-cau_87" "trnan-guu_54" "trnap-cgg_21"
## [2215] "trnal-uag_28" "trnal-aag_39" "trnas-gcu_70"
## [2218] "trnam-cau_88" "trnan-guu_55" "trnap-cgg_22"
## [2221] "trnal-uag_29" "trnal-aag_40" "trnas-gcu_71"
## [2224] "trnam-cau_89" "trnan-guu_56" "trnap-cgg_23"
## [2227] "trnal-uag_30" "trnal-aag_41" "trnas-gcu_72"
## [2230] "trnan-guu_57" "trnai-uau_32" "trnaf-gaa_53"
## [2233] "trnar-ccu_58" "trnal-cag_39" "trnaf-gaa_54"
## [2236] "trnal-cag_40" "trnaf-gaa_55" "trnal-cag_41"
## [2239] "trnaf-gaa_56" "trnal-uag_31" "trnaf-gaa_57"
## [2242] "trnal-cag_42" "trnar-ccu_59" "trnal-cag_43"
## [2245] "trnar-ccu_60" "trnaf-gaa_58" "trnal-cag_44"
## [2248] "trnal-cag_45" "trnal-cag_46" "trnaf-gaa_59"
## [2251] "trnar-ucu_4" "trnar-ucu_5" "trnar-ucu_6"
## [2254] "trnar-ucu_7" "trnar-ucu_8" "trnak-uuu_72"
## [2257] "trnar-ucu_9" "trnar-ucu_10" "trnar-ucu_11"
## [2260] "trnak-uuu_73" "trnar-ucu_12" "trnar-ucu_13"
## [2263] "trnar-ucu_14" "trnar-ucu_15" "trnaq-cug_33"
## [2266] "trnaq-uug_64" "trnaq-cug_34" "trnas-aga_33"
## [2269] "trnaq-uug_65" "trnaq-cug_35" "trnas-aga_34"
## [2272] "trnal-caa_14" "trnav-uac_48" "trnar-ucg_1"
## [2275] "trnar-ucg_2" "trnal-caa_15" "trnar-ucg_3"
## [2278] "trnar-ucg_4" "trnal-caa_16" "trnar-ucg_5"
## [2281] "trnar-ucg_6" "trnal-caa_17" "trnar-ucg_7"
## [2284] "trnar-ucg_8" "trnal-caa_18" "trnar-ucg_9"
## [2287] "trnar-ucg_10" "trnal-caa_19" "trnar-ucg_11"
## [2290] "trnar-ucg_12" "trnal-caa_20" "trnar-ucg_13"
## [2293] "trnar-ucg_14" "trnal-caa_21" "trnar-ucg_15"
## [2296] "trnar-ucg_16" "trnal-caa_22" "trnar-ucg_17"
## [2299] "trnar-ucg_18" "trnal-caa_23" "trnar-ucg_19"
## [2302] "trnar-ucg_20" "trnal-caa_24" "trnar-ucg_21"
## [2305] "trnar-ucg_22" "trnaa-agc_41" "trnae-cuc_65"
## [2308] "trnag-ucc_38" "trnan-guu_58" "trnal-uaa_71"
## [2311] "trnae-cuc_66" "trnan-guu_59" "trnar-ucg_23"
## [2314] "trnai-uau_33" "trnak-cuu_89" "trnak-cuu_90"
## [2317] "trnai-aau_74" "trnas-gcu_73" "trnar-ucu_16"
## [2320] "trnag-ucc_39" "trnae-uuc_45" "trnai-aau_75"
## [2323] "trnak-cuu_91" "trnar-ucu_17" "trnak-uuu_74"
## [2326] "trnal-caa_25" "trnav-uac_49" "trnag-gcc_77"
## [2329] "trnag-gcc_78" "trnag-gcc_79" "trnaa-agc_42"
## [2332] "trnae-cuc_67" "trnag-ucc_40" "trnan-guu_60"
## [2335] "trnal-uaa_72" "trnae-cuc_68" "trnae-cuc_69"
## [2338] "trnan-guu_61" "trnal-uaa_73" "trnar-ucg_24"
## [2341] "trnai-uau_34" "trnak-uuu_75" "trnai-uau_35"
## [2344] "trnal-uaa_74" "trnar-ucg_25" "trnai-uau_36"
## [2347] "trnal-uaa_75" "trnae-cuc_70" "trnaq-uug_66"
## [2350] "trnai-uau_37" "trnar-ucg_26" "trnai-uau_38"
## [2353] "trnai-uau_39" "trnar-ucg_27" "trnastop-uca_1"
## [2356] "trnas-gcu_74" "trnar-ucu_18" "trnag-ucc_41"
## [2359] "trnak-cuu_92" "trnas-gcu_75" "trnac-gca_16"
## [2362] "trnac-gca_17" "trnac-gca_18" "trnac-gca_19"
## [2365] "trnac-gca_20" "trnac-gca_21" "trnac-gca_22"
## [2368] "trnac-gca_23" "trnac-gca_24" "trnac-gca_25"
## [2371] "trnac-gca_26" "trnac-gca_27" "trnac-gca_28"
## [2374] "trnac-gca_29" "trnac-gca_30" "trnac-gca_31"
## [2377] "trnac-gca_32" "trnaq-cug_36" "trnaq-cug_37"
## [2380] "trnas-aga_35" "trnas-aga_36" "trnaq-uug_67"
## [2383] "trnar-ucu_19" "trnas-aga_37" "trnas-uga_22"
## [2386] "trnaq-cug_38" "trnas-aga_38" "trnan-guu_62"
## [2389] "trnar-ucu_20" "trnas-uga_23" "trnaq-cug_39"
## [2392] "trnas-aga_39" "trnan-guu_63" "trnas-uga_24"
## [2395] "trnaq-cug_40" "trnas-aga_40" "trnav-cac_68"
## [2398] "trnaq-uug_68" "trnaq-cug_41" "trnas-aga_41"
## [2401] "trnaq-uug_69" "trnaq-cug_42" "trnan-guu_64"
## [2404] "trnaq-cug_43" "trnas-uga_25" "trnas-uga_26"
## [2407] "trnas-aga_42" "trnan-guu_65" "trnav-aac_67"
## [2410] "trnaq-cug_44" "trnas-aga_43" "trnaq-uug_70"
## [2413] "trnan-guu_66" "trnas-aga_44" "trnac-gca_33"
## [2416] "trnam-cau_90" "trnar-ccu_61" "trnar-ucg_28"
## [2419] "trnap-agg_39" "trnam-cau_91" "trnaq-cug_45"
## [2422] "trnaq-uug_71" "trnas-aga_45" "trnar-ccg_6"
## [2425] "trnaa-cgc_29" "trnag-gcc_80" "trnat-cgu_40"
## [2428] "trnal-uag_32" "trnar-ucu_21" "trnar-ucu_22"
## [2431] "trnar-ucu_23" "trnar-ucu_24" "trnar-ucu_25"
## [2434] "trnar-ucu_26" "trnar-ucu_27" "trnar-ucu_28"
## [2437] "trnar-ucu_29" "trnar-ucu_30" "trnar-ucu_31"
## [2440] "trnar-ucu_32" "trnar-ucu_33" "trnar-ucu_34"
## [2443] "trnar-ucu_35" "trnar-ucu_36" "trnar-ucu_37"
## [2446] "trnar-ucu_38" "trnar-ucu_39" "trnar-ucu_40"
## [2449] "trnar-ucu_41" "trnar-ucu_42" "trnar-ucu_43"
## [2452] "trnar-ucu_44" "trnar-ucu_45" "trnar-ucu_46"
## [2455] "trnar-ucu_47" "trnar-ucu_48" "trnar-ucu_49"
## [2458] "trnak-uuu_76" "trnar-ucu_50" "trnar-ucu_51"
## [2461] "trnar-ucu_52" "trnar-ucu_53" "trnar-ucu_54"
## [2464] "trnar-ucu_55" "trnar-ucu_56" "trnar-ucu_57"
## [2467] "trnar-ucu_58" "trnar-ucu_59" "trnar-ucu_60"
## [2470] "trnar-ucu_61" "trnar-ucu_62" "trnar-ucu_63"
## [2473] "trnar-ucu_64" "trnar-ucu_65" "trnar-ucu_66"
## [2476] "trnar-ucu_67" "trnar-ucu_68" "trnar-ucu_69"
## [2479] "trnar-ucu_70" "trnar-ucu_71" "trnar-ucu_72"
## [2482] "trnar-ucu_73" "trnar-ucu_74" "trnar-ucu_75"
## [2485] "trnar-ucu_76" "trnar-ucu_77" "trnar-ucu_78"
## [2488] "trnar-ucu_79" "trnar-ucu_80" "trnar-ucu_81"
## [2491] "trnar-ucu_82" "trnar-ucu_83" "trnar-ucu_84"
## [2494] "trnar-ucu_85" "trnar-ucu_86" "trnar-ucu_87"
## [2497] "trnar-ucu_88" "trnar-ucu_89" "trnar-ucu_90"
## [2500] "trnar-ucu_91" "trnar-ucu_92" "trnai-uau_40"
## [2503] "trnak-uuu_77" "trnag-ucc_42" "trnac-gca_34"
## [2506] "trnac-gca_35" "trnac-gca_36" "trnac-gca_37"
## [2509] "trnac-gca_38" "trnac-gca_39" "trnac-gca_40"
## [2512] "trnaq-cug_46" "trnas-aga_46" "trnas-aga_47"
## [2515] "trnan-guu_67" "trnas-aga_48" "trnaq-uug_72"
## [2518] "trnav-cac_69" "trnaq-cug_47" "trnav-aac_68"
## [2521] "trnan-guu_68" "trnas-aga_49" "trnaq-uug_73"
## [2524] "trnav-cac_70" "trnaq-cug_48" "trnav-aac_69"
## [2527] "trnan-guu_69" "trnas-aga_50" "trnaq-uug_74"
## [2530] "trnav-cac_71" "trnaq-cug_49" "trnar-ucu_93"
## [2533] "trnan-guu_70" "trnas-uga_27" "trnas-uga_28"
## [2536] "trnas-aga_51" "trnan-guu_71" "trnav-aac_70"
## [2539] "trnaq-cug_50" "trnav-cac_72" "trnas-aga_52"
## [2542] "trnaq-uug_75" "trnan-guu_72" "trnac-gca_41"
## [2545] "trnac-gca_42" "trnac-gca_43" "trnam-cau_92"
## [2548] "trnar-ccu_62" "trnar-ucg_29" "trnam-cau_93"
## [2551] "trnaq-cug_51" "trnaq-uug_76" "trnas-aga_53"
## [2554] "trnar-ccg_7" "trnaa-cgc_30" "trnag-gcc_81"
## [2557] "trnat-cgu_41" "trnal-uag_33" "trnak-uuu_78"
## [2560] "trnag-ucc_43" "ccdc50.L_1" "unassigned_gene_1"
## [2563] "unassigned_gene_2" "unassigned_gene_3" "unassigned_gene_4"
## [2566] "unassigned_gene_5" "unassigned_gene_6" "unassigned_gene_7"
## [2569] "unassigned_gene_8" "unassigned_gene_9" "unassigned_gene_10"
## [2572] "unassigned_gene_11" "unassigned_gene_12" "unassigned_gene_13"
## [2575] "unassigned_gene_14" "unassigned_gene_15" "unassigned_gene_16"
## [2578] "unassigned_gene_17" "unassigned_gene_18" "unassigned_gene_19"
## [2581] "unassigned_gene_20" "unassigned_gene_21" "unassigned_gene_22"
## [2584] "unassigned_gene_23" "unassigned_gene_24"
A bunch of tRNAs. Please don’t stop reading at the point where the list is cut off, though. Let’s check whether the tRNA annotations are the only issue here:
setdiff(
  grep("_", rownames(xenopus.data), value = TRUE),
  grep("^trna", rownames(xenopus.data), value = TRUE)
)
## [1] "ccdc50.L_1" "unassigned_gene_1" "unassigned_gene_2"
## [4] "unassigned_gene_3" "unassigned_gene_4" "unassigned_gene_5"
## [7] "unassigned_gene_6" "unassigned_gene_7" "unassigned_gene_8"
## [10] "unassigned_gene_9" "unassigned_gene_10" "unassigned_gene_11"
## [13] "unassigned_gene_12" "unassigned_gene_13" "unassigned_gene_14"
## [16] "unassigned_gene_15" "unassigned_gene_16" "unassigned_gene_17"
## [19] "unassigned_gene_18" "unassigned_gene_19" "unassigned_gene_20"
## [22] "unassigned_gene_21" "unassigned_gene_22" "unassigned_gene_23"
## [25] "unassigned_gene_24"
There are some unassigned genes, but then there is one particular
gene ccdc50.L…
grep("ccdc50", rownames(xenopus.data), value = TRUE)
## [1] "ccdc50.L" "ccdc50.S" "ccdc50.L_1"
grep("ccdc50", rownames(xenopus), value = TRUE)
## [1] "ccdc50.L" "ccdc50.S" "ccdc50.L-1"
Check in Xenbase, as you learned on Friday, whether ccdc50.L is an
important gene and whether ccdc50.L_1 is a real, separate
gene.
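The two outputs above differ only in `ccdc50.L_1` versus `ccdc50.L-1`. This is consistent with Seurat replacing underscores in feature names with dashes when the object is created. The following is a minimal base-R sketch of that renaming on a toy vector; `toy.names` is made up for illustration, and `gsub()` only stands in for whatever Seurat does internally.

```r
# Toy feature names, modelled on the output above.
toy.names <- c("ccdc50.L", "ccdc50.S", "ccdc50.L_1", "unassigned_gene_1")

# Replace every underscore with a dash, mimicking the renamed features.
renamed <- gsub("_", "-", toy.names, fixed = TRUE)
renamed

# Names that changed, i.e. the ones to look up under their new spelling.
changed <- toy.names[toy.names != renamed]
changed
```

When searching a Seurat object for a gene taken from the raw matrix or the GTF, it may therefore be worth trying the dash-separated spelling as well.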
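We have already checked that the up-to-date assembly contains the mitochondrial genome and that its mitochondrial genes are annotated (see previous section). The following command narrows this down further by dropping the tRNAs, so that we can pick out the mitochondrial mRNAs, which are polyadenylated. Note that the two rRNA records remain in the output, so read the transcript_biotype field of each line.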
gzcat XENLA_ncbi101.XB2023_04.gtf.gz | gawk '( $1 == "chrM" && $3 == "transcript" )' | grep -v tRNA
## chrM RefSeq transcript 2205 3023 . + . gene_id "unassigned_gene_2"; transcript_id "unassigned_transcript_2833"; gbkey "rRNA"; product "12S ribosomal RNA"; transcript_biotype "rRNA";
## chrM RefSeq transcript 3093 4723 . + . gene_id "unassigned_gene_4"; transcript_id "unassigned_transcript_2835"; gbkey "rRNA"; product "16S ribosomal RNA"; transcript_biotype "rRNA";
## chrM RefSeq transcript 4799 5770 . + . gene_id "ND1"; transcript_id "unassigned_transcript_2837"; gbkey "mRNA"; gene "ND1"; transcript_biotype "mRNA";
## chrM RefSeq transcript 5979 7016 . + . gene_id "ND2"; transcript_id "unassigned_transcript_2841"; gbkey "mRNA"; gene "ND2"; transcript_biotype "mRNA";
## chrM RefSeq transcript 7397 8951 . + . gene_id "COX1"; transcript_id "unassigned_transcript_2847"; gbkey "mRNA"; gene "COX1"; transcript_biotype "mRNA";
## chrM RefSeq transcript 9109 9796 . + . gene_id "COX2"; transcript_id "unassigned_transcript_2850"; gbkey "mRNA"; gene "COX2"; transcript_biotype "mRNA";
## chrM RefSeq transcript 9873 10040 . + . gene_id "ATP8"; transcript_id "unassigned_transcript_2852"; gbkey "mRNA"; gene "ATP8"; transcript_biotype "mRNA";
## chrM RefSeq transcript 10031 10711 . + . gene_id "ATP6"; transcript_id "unassigned_transcript_2853"; gbkey "mRNA"; gene "ATP6"; transcript_biotype "mRNA";
## chrM RefSeq transcript 10711 11491 . + . gene_id "COX3"; transcript_id "unassigned_transcript_2854"; gbkey "mRNA"; gene "COX3"; transcript_biotype "mRNA";
## chrM RefSeq transcript 11562 11904 . + . gene_id "ND3"; transcript_id "unassigned_transcript_2856"; gbkey "mRNA"; gene "ND3"; transcript_biotype "mRNA";
## chrM RefSeq transcript 11974 12270 . + . gene_id "ND4L"; transcript_id "unassigned_transcript_2858"; gbkey "mRNA"; gene "ND4L"; transcript_biotype "mRNA";
## chrM RefSeq transcript 12264 13647 . + . gene_id "ND4"; transcript_id "unassigned_transcript_2859"; gbkey "mRNA"; gene "ND4"; transcript_biotype "mRNA";
## chrM RefSeq transcript 13855 15669 . + . gene_id "ND5"; transcript_id "unassigned_transcript_2863"; gbkey "mRNA"; gene "ND5"; transcript_biotype "mRNA";
## chrM RefSeq transcript 15665 16177 . - . gene_id "ND6"; transcript_id "unassigned_transcript_2864"; gbkey "mRNA"; gene "ND6"; transcript_biotype "mRNA";
## chrM RefSeq transcript 16249 17388 . + . gene_id "CYTB"; transcript_id "unassigned_transcript_2866"; gbkey "mRNA"; gene "CYTB"; transcript_biotype "mRNA";
There are several ways to extract the set of mitochondrial gene names. One simple way is to type them manually:
mito.genes <- c("ND1", "ND2", "COX1", "COX2", "ATP8", "ATP6", "COX3", "ND3", "ND4L", "ND4", "ND5", "ND6", "CYTB")
mito.genes
## [1] "ND1" "ND2" "COX1" "COX2" "ATP8" "ATP6" "COX3" "ND3" "ND4L" "ND4"
## [11] "ND5" "ND6" "CYTB"
rownames(xenopus)[rownames(xenopus) %in% mito.genes]
## [1] "ND1" "ND2" "COX1" "COX2" "ATP8" "ATP6" "COX3" "ND3" "ND4L" "ND4"
## [11] "ND5" "ND6" "CYTB"
The other way is to parse them with code, which I will leave as an advanced quiz; we will come back to this approach when extracting ribosomal genes.
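For readers who want a starting point for that quiz, here is one possible sketch in base R. It works on a few toy GTF lines modelled on the chrM output above, with the attributes abbreviated; the `chr1L` line is invented so that the filter has something to reject. Reading the real, gzipped GTF and skipping its comment lines is not shown.

```r
# Toy GTF records: 9 tab-separated columns, attributes abbreviated.
gtf.lines <- c(
  'chrM\tRefSeq\ttranscript\t2205\t3023\t.\t+\t.\tgene_id "unassigned_gene_2"; transcript_biotype "rRNA";',
  'chrM\tRefSeq\ttranscript\t4799\t5770\t.\t+\t.\tgene_id "ND1"; gene "ND1"; transcript_biotype "mRNA";',
  'chrM\tRefSeq\ttranscript\t7397\t8951\t.\t+\t.\tgene_id "COX1"; gene "COX1"; transcript_biotype "mRNA";',
  'chr1L\tRefSeq\ttranscript\t100\t900\t.\t+\t.\tgene_id "ccdc50.L"; transcript_biotype "mRNA";'
)

# Split each line into its columns and keep the three we need.
fields <- strsplit(gtf.lines, "\t", fixed = TRUE)
gtf <- data.frame(
  seqname   = vapply(fields, `[`, character(1), 1),
  feature   = vapply(fields, `[`, character(1), 3),
  attribute = vapply(fields, `[`, character(1), 9),
  stringsAsFactors = FALSE
)

# Keep mitochondrial transcripts whose biotype is mRNA.
is.mito.mrna <- gtf$seqname == "chrM" &
  gtf$feature == "transcript" &
  grepl('transcript_biotype "mRNA"', gtf$attribute, fixed = TRUE)

# Pull the gene_id value out of the attribute column.
mito.genes.parsed <- unique(
  sub('.*gene_id "([^"]+)".*', "\\1", gtf$attribute[is.mito.mrna])
)
mito.genes.parsed
```

Filtering on `transcript_biotype` rather than on the gene name is a design choice of this sketch: it also drops the rRNA records, which `grep -v tRNA` leaves in.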
In any case, whenever you compile information by hand rather than computing it, save it as a text table for your records.
save_table(
mito.genes,
glue::glue("{project.prefix}mito_genes"),
format="csv"
)
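`save_table()` is a helper and not part of base R. For readers who do not have it at hand, the following is a rough base-R sketch of the same record-keeping idea. The column name `gene` and the use of a temporary file are assumptions made for this illustration only.

```r
mito.genes <- c("ND1", "ND2", "COX1", "COX2", "ATP8", "ATP6", "COX3",
                "ND3", "ND4L", "ND4", "ND5", "ND6", "CYTB")

# Write the curated list as a one-column CSV (a temporary file here;
# in practice use a path inside your project directory).
out.file <- tempfile(pattern = "mito_genes_", fileext = ".csv")
write.csv(data.frame(gene = mito.genes), out.file, row.names = FALSE)

# Read it back to confirm that the record matches what was typed.
mito.genes.saved <- read.csv(out.file, stringsAsFactors = FALSE)$gene
identical(mito.genes.saved, mito.genes)
```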
We can now compute the mitochondrial fraction of each cell (as a percentage of its counts) and store it in the metadata.
xenopus[["percent.mt"]] <- Seurat::PercentageFeatureSet( xenopus, features = mito.genes )
xenopus@meta.data
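To make the number in `percent.mt` concrete: as far as I understand it, `PercentageFeatureSet()` reports, for each cell, the counts assigned to the given features as a percentage of that cell's total counts. The sketch below reproduces that arithmetic by hand on a made-up count matrix; it does not call Seurat, and all object names ending in `.toy` are hypothetical.

```r
# Toy count matrix: 4 genes (rows) by 3 cells (columns); values are made up.
# matrix() fills column by column, so each line of numbers below is one cell.
counts.toy <- matrix(
  c( 5,  5, 80, 10,
     0,  2, 10,  8,
    25, 25, 25, 25),
  nrow = 4,
  dimnames = list(c("ND1", "COX1", "actb.L", "gapdh.S"),
                  c("cell1", "cell2", "cell3"))
)
mito.toy <- c("ND1", "COX1")

# Percentage of each cell's counts that come from the mitochondrial genes.
percent.mt.toy <- 100 * colSums(counts.toy[mito.toy, , drop = FALSE]) /
  colSums(counts.toy)
percent.mt.toy
```

A cell whose value is unusually high compared with the rest of the dataset is a candidate for closer inspection during quality control.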
For ribosomal genes, I have compiled the list from the UniProt database. Download the files.
read_tsv("uniprot-download_true_fields_accession_2Creviewed_2Cid_2Cprotein_nam-2023.02.20-13.44.52.34.tsv") %>% head()
## Rows: 530 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (6): Entry, Reviewed, Entry Name, Protein names, Gene Names, Organism
## dbl (1): Length
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
read_tsv("uniprot-download_true_fields_accession_2Creviewed_2Cid_2Cprotein_nam-2023.02.20-13.44.52.34.tsv") %>%
dplyr::count( `Reviewed` )
## Rows: 530 Columns: 7
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (6): Entry, Reviewed, Entry Name, Protein names, Gene Names, Organism
## dbl (1): Length
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# this takes out a lot o
provisional <-
bind_rows(
read_tsv("uniprot-download_true_fields_accession_2Creviewed_2Cid_2Cprotein_nam-2023.02.20-13.44.52.34.tsv", show_col_types = F),
) %>%
dplyr::select(`Gene Names`, `Organism` ) %>%
dplyr::filter( !is.na(`Gene Names`) ) %>%
separate_rows( `Gene Names`, convert = FALSE, sep = " " ) %>%
pull(`Gene Names` )
#dplyr::filter( grepl("14e22", `Gene Names` ) )
provisional
## [1] "rpsa.S" "37lrp" "lambr"
## [4] "lamr1" "LR" "lrp"
## [7] "p40" "RPSA" "rpsa"
## [10] "rpsa.L" "37LRP" "67LR"
## [13] "lambr" "LamR" "lamr1"
## [16] "LBP/p40" "LR" "lrp"
## [19] "LRP/LR" "p40" "RPSA"
## [22] "rpsa" "rpsa.S" "37lrp"
## [25] "lambr" "lamr1" "LR"
## [28] "lrp" "p40" "RPSA"
## [31] "rpsa" "LOC100037080" "RPSA"
## [34] "rps3-a" "rps6ka" "rps3-b"
## [37] "rpsa" "ogfod1" "impact"
## [40] "top2b.S" "top2b.L" "rps6ka6.S"
## [43] "pp90rsk4" "rps6ka" "rps6ka1"
## [46] "rps6ka6" "rsk4" "lonp1.L"
## [49] "LONP1" "rps6ka1.L" "hu-1"
## [52] "mapkapk1a" "rps6ka1" "rsk"
## [55] "rsk1" "rps6ka4.L" "rps6ka4"
## [58] "top2b.S" "top2b.L" "top2b.S"
## [61] "top2b.S" "rps6ka1.L" "hu-1"
## [64] "mapkapk1a" "rps6ka1" "rsk"
## [67] "rsk1" "rps6ka1.L" "hu-1"
## [70] "mapkapk1a" "rps6ka1" "rsk"
## [73] "rsk1" "rps6ka1.L" "hu-1"
## [76] "mapkapk1a" "rps6ka1" "rsk"
## [79] "rsk1" "rps6ka4.L" "rps6ka4"
## [82] "gfm1" "efg1" "rpl4-b"
## [85] "rpl1b" "rpl18-b" "rpl14b"
## [88] "rpl5-a" "rpl5-b" "lonp2"
## [91] "ptcd3" "snu13" "nhp2l1"
## [94] "hsp90ab1" "hsp90beta" "XELAEV_18028538mg"
## [97] "riox2.L" "mina" "mina-prov"
## [100] "mina.L" "NO52" "riox2"
## [103] "nsa2.L" "cdk105" "hcl-g1"
## [106] "hclg1" "hussy-29" "hussy29"
## [109] "nsa2" "tinp1" "yr-29"
## [112] "mrpl24" "rps6ka1.L" "hu-1"
## [115] "mapkapk1a" "MGC81220" "rps6ka1"
## [118] "rsk" "rsk1" "top2a.L"
## [121] "LOC398512" "top2" "top2a"
## [124] "tp2a" "snu13.L" "fa-1"
## [127] "fa1" "hoip" "nhp2l1"
## [130] "nhp2l1-b" "nhpx" "otk27"
## [133] "snrnp15-5" "snu13" "spag12"
## [136] "ssfa1" "rpls3-b" "rps6ka3.L"
## [139] "p90" "rps6ka3" "rsk"
## [142] "RSK2" "rsk2" "S6KII"
## [145] "rps6kb1.S" "p70-alpha" "p70-s6k"
## [148] "p70s6k" "ps6k" "rps6kb1"
## [151] "rps6kb1-A" "s6k" "s6K1"
## [154] "stk14a" "rps27.L" "pms1.S"
## [157] "pms1" "LOC108703225" "eftud2.L"
## [160] "LOC108700925" "rpl27.L" "LOC108700150"
## [163] "rack1.L" "gnb2-rs1" "gnb2l1"
## [166] "h12.3" "hlc-7" "pig21"
## [169] "rack1.S" "mrps18b.L" "mlh3.L"
## [172] "mlh3" "rps6ka5.L" "mrpl24.L"
## [175] "L24" "mrpl24" "rpl24"
## [178] "rpl24-A" "mrto4.S" "rps9.L"
## [181] "rpl15.S" "LOC121394610" "rps20.L"
## [184] "rps20" "pnpt1.S" "LOC108717861"
## [187] "rps7.S" "dba8" "rps7"
## [190] "rps7.L" "rpS8A" "rpS8B"
## [193] "rps27a.L" "eprs1.L" "eprs"
## [196] "mrpl49.L" "mvd.L" "qars1.L"
## [199] "nhp2.S" "nhp2.L" "LOC108710993"
## [202] "rps27l.L" "mrpl30.S" "rps26.S"
## [205] "riox2.L" "mina" "mina.L"
## [208] "NO52" "rpl8.L" "mrps15.L"
## [211] "rps6kb1.L" "rps15.S" "rig"
## [214] "rps15" "mrpl35.L" "rpl34.L"
## [217] "L34" "rpl34" "xl34"
## [220] "mvk.L" "rplp0.L" "rpl37.L"
## [223] "gfm2.L" "EFG2" "GFM2"
## [226] "rpl17.L" "b" "l17"
## [229] "pd-1" "rpl17" "rpl17-a"
## [232] "rpl17-b" "rpl23" "rps19.S"
## [235] "rps19" "rpl11.S" "rpl11"
## [238] "rpl11.L" "mlh3.L" "mlh3"
## [241] "rps25.L" "rpl35.L" "l35"
## [244] "rpl35" "LOC108698757" "mlh3.L"
## [247] "mlh3" "mlh3.L" "mlh3"
## [250] "rpl12.L" "LOC108698451" "LOC108698451"
## [253] "mrps18b.L" "LOC108700150" "MGC114789"
## [256] "LOC108703484" "pms2.L" "rpl6.S"
## [259] "rpl6" "rpl6.S" "rpl6"
## [262] "rps15a.S" "rps15a" "rps22"
## [265] "LOC108706570" "rpl6.S" "rpl6"
## [268] "rps27l.L" "rps26.S" "mrps31.S"
## [271] "rps27l.S" "eprs1.L" "eprs"
## [274] "rps6kc1.L" "rpk118" "rps6kc1"
## [277] "s6pkh1" "mrps18a.L" "mrp-s18-3"
## [280] "mrps18-3" "mrps18a" "s18bmt"
## [283] "LOC108717877" "hsp90ab1.S" "hsp90ab1"
## [286] "hsp90b" "hsp90beta" "mlh1.S"
## [289] "mlh1" "mutL" "eprs1.L"
## [292] "eprs" "eprs1.L" "eprs"
## [295] "eprs1.L" "eprs" "LOC121393781"
## [298] "LOC121393773" "eprs1.S" "eprs"
## [301] "eprs.S" "rps12.S" "rps12"
## [304] "rps12-a" "rps12-b" "rps12b"
## [307] "eprs1.S" "eprs" "eprs.S"
## [310] "rps24.L" "rps24" "rps24.L"
## [313] "LOC121393051" "rps24" "LOC121393051"
## [316] "rps24" "rps24.L" "LOC121393051"
## [319] "rps24" "rps24.L" "LOC121393051"
## [322] "rps15.S" "rig" "rps15"
## [325] "rps27.L" "mrpl9.S" "comp72"
## [328] "l9mt" "mrpl9" "LOC108700925"
## [331] "nsa2.S" "mrps31.L" "rps6ka3.S"
## [334] "rps6ka3.S" "rps27l.L" "mlh1.S"
## [337] "LOC100036779" "mlh1" "mutL"
## [340] "mrps36.S" "dc47" "LOC100037103"
## [343] "mrp-s36" "mrps36" "mrps36.L"
## [346] "LOC100037089" "rpl7a.L" "rpl7a"
## [349] "XB5843130.S" "LOC100049136" "XB5843130"
## [352] "LOC100101334" "rps3a-b" "LOC100126642"
## [355] "mrpl4.L" "cgi-28" "l4mt"
## [358] "mrpl4" "LOC100127328" "LOC100037086"
## [361] "ffcskk.L" "fcsk" "fuk"
## [364] "fuk.L" "LOC100137687" "eprs1.S"
## [367] "eprs" "eprs.S" "eprs1"
## [370] "LOC100158438" "ipo7.L" "imp7"
## [373] "ipo7" "MGC52556" "ranbp7"
## [376] "rps7" "rps8" "rps24"
## [379] "rpl35a" "rpl4-a" "rpl-4"
## [382] "rpl1a" "rpl18-a" "rpl14a"
## [385] "rps15" "rig" "rps20"
## [388] "rps6" "rps11" "rpl8"
## [391] "rpl28" "rpl27a" "rpl22"
## [394] "rps12" "rps27" "rps13"
## [397] "rps4" "rps4x" "rpl21"
## [400] "rpl26" "rpl22" "mrpl45"
## [403] "rps10" "mrps16.L" "LOC100158393"
## [406] "mrps16" "mrps9.L" "mrps9"
## [409] "mrpl16.S" "mrpl16" "mrpl16.L"
## [412] "rplp2.L" "MGC154377" "rplp2"
## [415] "LOC734179" "mrpl23.S" "MGC131313"
## [418] "mrpl23" "mrps24-b" "mrps18b.S"
## [421] "MGC130639" "mrp-s18-2" "mrps18-2"
## [424] "mrps18b" "ptd017" "s18amt"
## [427] "SCL75" "rpl39.S" "MGC116452"
## [430] "MGC116477" "rpl39" "rpl39-a"
## [433] "rpl39-b" "rpl39.L" "rpl39a"
## [436] "rpl39b" "rpl35.L" "l35"
## [439] "MGC116425" "rpl35" "fau.L"
## [442] "fau" "MGC116435" "rps29.S"
## [445] "MGC114875" "rps29" "rps15a.S"
## [448] "MGC114789" "MGC130892" "rps15a"
## [451] "rps22" "mrpl52" "mrps18a.L"
## [454] "MGC115435" "mrp-s18-3" "mrps18-3"
## [457] "mrps18a" "s18bmt" "MGC114621"
## [460] "MGC98504" "LOC733189" "MGC115171"
## [463] "mrpl40.S" "LOC496101" "mrpl40"
## [466] "mrps11.L" "LOC495996" "mrps11"
## [469] "mrpl9.S" "comp72" "l9mt"
## [472] "LOC495474" "mrpl9" "mrpl20"
## [475] "mrpl12.L" "LOC495364" "mrpl12"
## [478] "mrps33.L" "LOC495310" "mrps33"
## [481] "mrpl3.S" "LOC494992" "mrpl3"
## [484] "rpl3l.L" "LOC494722" "rpl3l"
## [487] "mrpl2.L" "cgi-22" "MGC84466"
## [490] "mrp-l14" "mrpl2" "rpml14"
## [493] "rps28p9.L" "MGC85550" "rps28p9"
## [496] "rps28p9.S" "rpl36" "rpl36a.L"
## [499] "LOC108700056" "MGC85428" "rpl36a"
## [502] "rpl36a.S" "rpl38.L" "MGC85404"
## [505] "rpl38" "rpl38.S" "rpl28.S"
## [508] "l28" "LOC100101273" "rpl28"
## [511] "rpl28-a" "rpl28-b" "rpl28.L"
## [514] "rpl29.S" "MGC85384" "rpl29"
## [517] "rpl29-a" "rpl29-b" "rpl29b"
## [520] "rpl23a.S" "l23a" "mda20"
## [523] "MGC85348" "rpl23a" "rpl23a.L"
## [526] "rpl11.S" "MGC85310" "rpl11"
## [529] "rpl11.L" "mrpl51" "mrpl28"
## [532] "rps21" "rps26.L" "MGC86356"
## [535] "rps26" "rps23.S" "MGC86316"
## [538] "rps23" "rps23.L" "mrpl15"
## [541] "rpl17.S" "l17" "MGC78885"
## [544] "pd-1" "rpl17" "rpl17-a"
## [547] "rpl17-b" "rpl17a" "rpl23"
## [550] "yy1.L" "FIII" "ino80s"
## [553] "nf-e1" "ucrbp" "xyy1"
## [556] "yin-yang-1" "yy1" "yy1-a"
## [559] "yy1-b" "mrps7" "rpl6.S"
## [562] "MGC84358" "rpl6" "rpl19.S"
## [565] "rpl19" "rpl19-prov" "rps8.L"
## [568] "MGC83421" "rps8" "mrpl41-a"
## [571] "LOC398653" "rps25.L" "MGC82151"
## [574] "rps25" "rps25.S" "rps20.S"
## [577] "MGC82136" "rpl13.S" "rpl13"
## [580] "rpl13-prov" "mrpl17.S" "MGC83084"
## [583] "mrpl17" "rps27a.S" "MGC81889"
## [586] "rps27a" "rps27a.L" "rpl37.S"
## [589] "MGC82973" "rpl37" "rpl30.L"
## [592] "l30" "MGC82844" "rpl30"
## [595] "rpl30-a" "rpl30-b" "rps17.L"
## [598] "MGC82841" "rps17" "rps17.S"
## [601] "rpl23.S" "LOC108700787" "MGC82808"
## [604] "rpl23" "galk1.L" "galk1"
## [607] "MGC82807" "rps6kb1-A" "rps9.S"
## [610] "MGC80804" "rps9" "rps9.L"
## [613] "mlh3.L" "MGC80774" "mlh3"
## [616] "MGC80700" "rpl7a.S" "MGC80199"
## [619] "rpl7a" "surf3" "trup"
## [622] "rplp2.S" "MGC80163" "uba52.L"
## [625] "LOC108706905" "MGC80109" "uba52"
## [628] "exosc9.L" "exosc9" "SCL75"
## [631] "scl75" "mrpl41-b" "rpl14.S"
## [634] "MGC83076" "rpl14" "mrps24-a"
## [637] "rps18.L" "LOC108700218" "MGC82306"
## [640] "rps18" "mrps26.L" "MGC82245"
## [643] "mrp-s13" "mrp-s26" "mrps13"
## [646] "mrps26" "rpms13" "nhp2"
## [649] "nola2" "rps6kc1.L" "MGC81290"
## [652] "rpk118" "rps6kc1" "s6pkh1"
## [655] "rpl31" "rplp1.L" "MGC68562"
## [658] "rplp1" "hsp90b1.S" "ecgp"
## [661] "gp96" "grp94" "hsp90b1"
## [664] "MGC68448" "tra1" "qars1.S"
## [667] "MGC69128" "qars" "qars.S"
## [670] "qars1" "rps12.S" "MGC68529"
## [673] "rps12" "rps12-a" "rps12-b"
## [676] "rps12b" "rpl27.S" "rpl27"
## [679] "rps19.S" "rps19" "rps14.S"
## [682] "rps14" "rps14-prov" "rps14.L"
## [685] "rpl4.S" "rpl4" "rpl4-b"
## [688] "rps8" "mrpl44.S" "mrpl44"
## [691] "rps11.L" "LOC108702869" "rps11"
## [694] "rpl28.L" "l28" "rpl28"
## [697] "rpl28-a" "rpl28-b" "rpl28.S"
## [700] "rpl18.S" "MGC64315" "rpl14a"
## [703] "rpl18" "rpl18-a" "rpl18-b"
## [706] "rpl18.L" "rpl29.L" "rpl29"
## [709] "rpl29-a" "rpl29-b" "rpl29a"
## [712] "rpl18.L" "L14B" "rpl14b"
## [715] "rpl18" "rpl18-a" "rpl18-b"
## [718] "rpl18.S" "rpl35a.S" "LOC121393140"
## [721] "rpl35a" "rpl35a.L" "rpl27a.L"
## [724] "LOC121400351" "rpl27a" "rpl21.L"
## [727] "rpl21" "rpl37a" "LOC398653"
## [730] "rpl18a.S" "MGC64263" "rpl18a"
## [733] "rpl30.S" "l30" "rpl30"
## [736] "rpl30-a" "rpl30-b" "rps13.L"
## [739] "rps13" "mrto4.L" "mrt4"
## [742] "mrto4" "rps2.L" "rps2"
## [745] "rps2e" "hsp90b1.L" "ecgp"
## [748] "gp96" "grp94" "hsp90b1"
## [751] "tra1" "rpl9.L" "rpl9"
## [754] "rpl15.L" "rpl15" "rpl10.S"
## [757] "rpl10" "rpl10.L" "eef2.1.L"
## [760] "eef-2" "eef2" "eef2.1"
## [763] "ef2" "LOC398512" "pms1.S"
## [766] "pms1" "eftud2.S" "eftud2"
## [769] "snrp116" "snu114" "rps12.L"
## [772] "rps12" "rps12-a" "rps12-b"
## [775] "rps12a" "rpl13a.L" "rpl13a"
## [778] "rpl17.L" "RPL17" "rpl17"
## [781] "ckm.L" "ckm" "ckm.S"
## [784] "ckmm" "m-ck" "rpl3.L"
## [787] "rpl3" "rpl19" "rpl10a"
## [790] "rps6.L" "rps6" "rps6-a"
## [793] "rps6-b" "rps6b" "rps9"
## [796] "rps3a-a" "rplp0.S" "arbp"
## [799] "l10e" "lp0" "prlp0"
## [802] "rplp0" "rpp0" "rpl5.L"
## [805] "rpl5" "rpl5-a" "rpl5-b"
## [808] "rack1.L" "gnb2-rs1" "gnb2l1"
## [811] "h12.3" "hlc-7" "LOC446289"
## [814] "pig21" "rack1" "rack1.S"
## [817] "exosc8.S" "exosc8" "exosc8.L"
## [820] "MGC52847" "rpl12.S" "rpl12"
## [823] "rpl5.S" "rpl5" "rpl5-a"
## [826] "rpl5-b" "rpl34.S" "L34"
## [829] "rpl34" "rpl34.L" "XL34"
## [832] "xl34" "yy1.L" "FIII"
## [835] "ino80s" "nf-e1" "ucrbp"
## [838] "xyy1" "yin-yang-1" "yy1"
## [841] "yy1-a" "yy1-b" "mrpl10.S"
## [844] "mrps7.S" "rpl13a.S" "trap1.S"
## [847] "galk1.L" "galk1" "mrpl28.L"
## [850] "mrpl28" "rsl1d1.L" "rsl1d1"
## [853] "trap1.L" "mrps10.S" "mrps12.S"
## [856] "mrps21.S" "rpl10.L" "rps5.L"
## [859] "exosc5.L" "rps6kl1.L" "hsp90aa1.1.L"
## [862] "LOC108698781" "mrps21.L" "LOC108697549"
## [865] "LOC108697688" "mrpl51.L" "XB5896631.L"
## [868] "rps19.L" "mrpl32.S" "LOC108695345"
## [871] "mrpl55.L" "mrpl3.L" "rps6kc1.S"
## [874] "mrpl57.S" "mrpl33.L" "mrpl14.L"
## [877] "mrps5.L" "fau.S" "mrpl11.S"
## [880] "mrps14.S" "imp3.S" "LOC108715766"
## [883] "LOC108715857" "mrps25.S" "exosc6.L"
## [886] "rps19bp1.L" "LOC108714734" "mrps33.S"
## [889] "mrpl46.S" "rps16.S" "mrpl42.S"
## [892] "rpl26.S" "mrps11.L" "mrps11"
## [895] "mrpl46.L" "rpl37a.L" "LOC108711439"
## [898] "LOC108712865" "rpl23a.L" "mrps6.S"
## [901] "rpl24.S" "mrps31.S" "mrpl48.S"
## [904] "mrpl48" "mrpl39.L" "LOC108707957"
## [907] "rpl11.L" "rpl24.L" "rpl24"
## [910] "mrps31.L" "mrpl54.S" "mrps18c.S"
## [913] "mrpl52.S" "mrpl52" "mrpl1.L"
## [916] "LOC108712230" "LOC108713416" "mrpl52.L"
## [919] "rpl6.L" "eef2.2.L" "MGC68699"
## [922] "mrps18c.L" "mrpl38.L" "mrpl38"
## [925] "mrpl51.L" "XB5896631.L" "LOC108698781"
## [928] "LOC108698235" "LOC108700022" "mrpl1.S"
## [931] "LOC108700022" "rsl1d1.S" "mrpl27.L"
## [934] "rsl1d1.S" "rpl3l.S" "mrpl48.S"
## [937] "mrpl48" "rps10.L" "rps10"
## [940] "LOC108707348" "eef2.1.S" "LOC108708365"
## [943] "mrpl42.L" "mrpl48.S" "mrpl48"
## [946] "galk2.L" "galk2" "rps14.L"
## [949] "mrpl42.L" "rpl18a.L" "galk2.L"
## [952] "galk2" "mrpl1.L" "LOC108710319"
## [955] "mrps11.S" "mrpl52.L" "mrps25.S"
## [958] "rpl7l1.L" "rpl7l1" "mrps27.L"
## [961] "mrpl47.L" "mrpl47.L" "mrpl3.S"
## [964] "mrpl3" "mrpl57.S" "mrpl19.S"
## [967] "mrpl3.L" "LOC108715263" "LOC108716102"
## [970] "LOC108716102" "mrps22.L" "mrps22.L"
## [973] "LOC108717404" "LOC108705784" "LOC108717492"
## [976] "LOC121394503" "mrpl51.L" "LOC121395362"
## [979] "LOC121396125" "LOC108705858" "LOC121396648"
## [982] "rpl36a.L" "rpl36a" "rpl36a.S"
## [985] "LOC121396642" "exosc5.L" "dap3.L"
## [988] "dap3.L" "LOC108700056" "rsl1d1.L"
## [991] "rsl1d1" "rsl1d1.L" "rsl1d1"
## [994] "mrpl45.L" "mrlp45" "mrpl45"
## [997] "LOC108706079" "mrps18c.S" "LOC121400386"
## [1000] "LOC121400621" "LOC121400576" "LOC121400621"
## [1003] "LOC108709120" "mrps11.L" "mrps11"
## [1006] "mrpl22.L" "LOC108712266" "LOC108712865"
## [1009] "LOC108712266" "LOC108712865" "LOC108712883"
## [1012] "LOC108712266" "rps13.L" "rps13"
## [1015] "LOC121402816" "LOC121402815" "LOC121393045"
## [1018] "mrpl1.L" "rpl22l1.L" "LOC100037062"
## [1021] "rpl22l1" "LOC100037111" "mrpl37.L"
## [1024] "LOC733390" "mrpl37" "LOC100037086"
## [1027] "LOC100037184" "PDCD9" "LOC100049095"
## [1030] "mrps2.L" "LOC443704" "mrps2"
## [1033] "mrps25.L" "LOC100049746" "mrps25"
## [1036] "LOC100126614" "mrpl32.L" "LOC100126616"
## [1039] "mrpl32" "LOC100127333" "mrps35.L"
## [1042] "LOC100127340" "mrps35" "PDCD9"
## [1045] "rsl1d1.L" "LOC100137646" "rsl1d1"
## [1048] "rps10.L" "LOC445824" "rps10"
## [1051] "mrps2.S" "LOC733351" "LOC733390"
## [1054] "mrps30.L" "MGC131350" "mrps30"
## [1057] "LOC733400" "LOC733390" "rpl7.L"
## [1060] "MGC130910" "rpl7" "LOC733385"
## [1063] "mrpl21.L" "MGC131341" "mrpl21"
## [1066] "LOC733351" "rps16" "LOC446962"
## [1069] "rpl34" "LOC733302" "LOC734162"
## [1072] "mrpl11.L" "LOC496259" "mrpl11"
## [1075] "mrpl13.S" "LOC496258" "mrpl13"
## [1078] "exosc8.L" "LOC496046" "exosc4.S"
## [1081] "exosc4" "LOC495942" "LOC495666"
## [1084] "rpl7l1.L" "LOC495349" "rpl7l1"
## [1087] "mrpl48.S" "LOC495212" "mrpl48"
## [1090] "LOC446962" "mrpl53.L" "MGC85354"
## [1093] "mrpl53" "rpl32.L" "MGC85374"
## [1096] "rpl32" "rpl24.L" "MGC85232"
## [1099] "rpl24" "MGC84749" "eft-2-prov"
## [1102] "exosc7.S" "exosc7" "exosc7-prov"
## [1105] "mrpl1.S" "mrpl1" "mrpl1-prov"
## [1108] "rpl26.L" "LOC443704" "LOC445824"
## [1111] "rsl24d1.L" "MGC81028" "rlp24"
## [1114] "rpl24" "rpl24l" "rsl24d1"
## [1117] "rvas3" "mrpl11.S" "MGC82344"
## [1120] "mrpl11" "imp3.L" "imp3"
## [1123] "MGC81216" "rps16.L" "MGC80065"
## [1126] "rps16" "rps5.S" "rps5"
## [1129] "mrps17.L" "mrps17" "mrps12.L"
## [1132] "mrps12" "rps10.S" "rps10"
## [1135] "LOC398682" "mrpl18.L" "mrpl18"
## [1138] "mrps34.S" "mrps34" "galk2.L"
## [1141] "galk2" "mrpl43.L" "mrpl43"
## [1144] "rsl24d1.S" "rlp24" "rpl24"
## [1147] "rpl24l" "rsl24d1" "rvas3"
## [1150] "RPL18A" "mrps30.S" "mrps30"
## [1153] "PDCD9" "pdcd9" "S14"
## [1156] "RACK1"
provisional %>% length()
## [1] 1156
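The pipeline that builds `provisional` splits the space-separated `Gene Names` field into one name per element, and the 1156 entries still include repeats (for example `lambr` appears several times). The base-R sketch below shows the same splitting on a toy table; `uniprot.toy` and its contents are invented for illustration, and `strsplit()` is used here in place of `separate_rows()`.

```r
# Toy stand-in for the UniProt table; only the `Gene Names` column matters.
uniprot.toy <- data.frame(
  `Gene Names` = c("rpsa.S 37lrp lambr", NA, "rps3-a", "rpsa.L 37LRP lambr"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Drop missing entries, then split each field on single spaces.
gene.names <- uniprot.toy[["Gene Names"]]
gene.names <- gene.names[!is.na(gene.names)]
provisional.toy <- unlist(strsplit(gene.names, " ", fixed = TRUE))
provisional.toy

# Compare the raw length with the number of distinct names.
n.all    <- length(provisional.toy)
n.unique <- length(unique(provisional.toy))
c(all = n.all, unique = n.unique)
```

Because `intersect()` and `setdiff()` return distinct values, the repeats do not affect the comparisons against `rownames(xenopus)`; they only inflate the raw length.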
intersect( rownames(xenopus), provisional ) %>% length()
## [1] 309
# intersect( gene.list.frog, provisional )
# Potential genes that might be missed
potential <- setdiff( provisional, rownames(xenopus) )
potential
## [1] "37lrp" "lambr" "lamr1"
## [4] "LR" "lrp" "p40"
## [7] "RPSA" "rpsa" "37LRP"
## [10] "67LR" "LamR" "LBP/p40"
## [13] "LRP/LR" "LOC100037080" "rps3-a"
## [16] "rps6ka" "rps3-b" "ogfod1"
## [19] "impact" "pp90rsk4" "rps6ka1"
## [22] "rps6ka6" "rsk4" "LONP1"
## [25] "hu-1" "mapkapk1a" "rsk"
## [28] "rsk1" "rps6ka4" "gfm1"
## [31] "efg1" "rpl4-b" "rpl1b"
## [34] "rpl18-b" "rpl14b" "rpl5-a"
## [37] "rpl5-b" "lonp2" "ptcd3"
## [40] "snu13" "nhp2l1" "hsp90ab1"
## [43] "hsp90beta" "XELAEV_18028538mg" "mina"
## [46] "mina-prov" "mina.L" "NO52"
## [49] "riox2" "cdk105" "hcl-g1"
## [52] "hclg1" "hussy-29" "hussy29"
## [55] "nsa2" "tinp1" "yr-29"
## [58] "mrpl24" "MGC81220" "LOC398512"
## [61] "top2" "top2a" "tp2a"
## [64] "fa-1" "fa1" "hoip"
## [67] "nhp2l1-b" "nhpx" "otk27"
## [70] "snrnp15-5" "spag12" "ssfa1"
## [73] "rpls3-b" "p90" "rps6ka3"
## [76] "RSK2" "rsk2" "S6KII"
## [79] "p70-alpha" "p70-s6k" "p70s6k"
## [82] "ps6k" "rps6kb1" "rps6kb1-A"
## [85] "s6k" "s6K1" "stk14a"
## [88] "pms1" "gnb2-rs1" "gnb2l1"
## [91] "h12.3" "hlc-7" "pig21"
## [94] "mlh3" "L24" "rpl24"
## [97] "rpl24-A" "rps20" "dba8"
## [100] "rps7" "rpS8A" "rpS8B"
## [103] "eprs" "nhp2.L" "rig"
## [106] "rps15" "L34" "rpl34"
## [109] "xl34" "EFG2" "GFM2"
## [112] "b" "l17" "pd-1"
## [115] "rpl17" "rpl17-a" "rpl17-b"
## [118] "rpl23" "rps19" "rpl11"
## [121] "l35" "rpl35" "rpl6"
## [124] "rps15a" "rps22" "rpk118"
## [127] "rps6kc1" "s6pkh1" "mrp-s18-3"
## [130] "mrps18-3" "mrps18a" "s18bmt"
## [133] "LOC108717877" "hsp90b" "mlh1"
## [136] "mutL" "eprs.S" "rps12"
## [139] "rps12-a" "rps12-b" "rps12b"
## [142] "rps24" "comp72" "l9mt"
## [145] "mrpl9" "LOC100036779" "dc47"
## [148] "LOC100037103" "mrp-s36" "mrps36"
## [151] "mrps36.L" "LOC100037089" "rpl7a"
## [154] "LOC100049136" "XB5843130" "LOC100101334"
## [157] "rps3a-b" "LOC100126642" "cgi-28"
## [160] "l4mt" "mrpl4" "LOC100127328"
## [163] "LOC100037086" "fcsk" "fuk"
## [166] "fuk.L" "LOC100137687" "eprs1"
## [169] "LOC100158438" "imp7" "ipo7"
## [172] "MGC52556" "ranbp7" "rps8"
## [175] "rpl35a" "rpl4-a" "rpl-4"
## [178] "rpl1a" "rpl18-a" "rpl14a"
## [181] "rps6" "rps11" "rpl8"
## [184] "rpl28" "rpl27a" "rpl22"
## [187] "rps27" "rps13" "rps4"
## [190] "rps4x" "rpl21" "rpl26"
## [193] "mrpl45" "rps10" "LOC100158393"
## [196] "mrps16" "mrps9" "mrpl16"
## [199] "mrpl16.L" "MGC154377" "rplp2"
## [202] "LOC734179" "MGC131313" "mrpl23"
## [205] "mrps24-b" "MGC130639" "mrp-s18-2"
## [208] "mrps18-2" "mrps18b" "ptd017"
## [211] "s18amt" "SCL75" "MGC116452"
## [214] "MGC116477" "rpl39" "rpl39-a"
## [217] "rpl39-b" "rpl39a" "rpl39b"
## [220] "MGC116425" "fau" "MGC116435"
## [223] "MGC114875" "rps29" "MGC130892"
## [226] "mrpl52" "MGC115435" "MGC98504"
## [229] "LOC733189" "MGC115171" "LOC496101"
## [232] "mrpl40" "LOC495996" "mrps11"
## [235] "LOC495474" "mrpl20" "LOC495364"
## [238] "mrpl12" "LOC495310" "mrps33"
## [241] "LOC494992" "mrpl3" "rpl3l.L"
## [244] "LOC494722" "rpl3l" "cgi-22"
## [247] "MGC84466" "mrp-l14" "mrpl2"
## [250] "rpml14" "MGC85550" "rps28p9"
## [253] "rpl36" "MGC85428" "rpl36a"
## [256] "rpl36a.S" "MGC85404" "rpl38"
## [259] "l28" "LOC100101273" "rpl28-a"
## [262] "rpl28-b" "MGC85384" "rpl29"
## [265] "rpl29-a" "rpl29-b" "rpl29b"
## [268] "l23a" "mda20" "MGC85348"
## [271] "rpl23a" "MGC85310" "mrpl51"
## [274] "mrpl28" "rps21" "MGC86356"
## [277] "rps26" "MGC86316" "rps23"
## [280] "mrpl15" "MGC78885" "rpl17a"
## [283] "FIII" "ino80s" "nf-e1"
## [286] "ucrbp" "xyy1" "yin-yang-1"
## [289] "yy1" "yy1-a" "yy1-b"
## [292] "mrps7" "MGC84358" "rpl19.S"
## [295] "rpl19" "rpl19-prov" "MGC83421"
## [298] "mrpl41-a" "MGC82151" "rps25"
## [301] "MGC82136" "rpl13" "rpl13-prov"
## [304] "MGC83084" "mrpl17" "MGC81889"
## [307] "rps27a" "MGC82973" "rpl37"
## [310] "l30" "MGC82844" "rpl30"
## [313] "rpl30-a" "rpl30-b" "MGC82841"
## [316] "rps17" "MGC82808" "galk1"
## [319] "MGC82807" "MGC80804" "MGC80774"
## [322] "MGC80199" "surf3" "trup"
## [325] "MGC80163" "MGC80109" "uba52"
## [328] "exosc9" "scl75" "mrpl41-b"
## [331] "MGC83076" "rpl14" "mrps24-a"
## [334] "MGC82306" "rps18" "MGC82245"
## [337] "mrp-s13" "mrp-s26" "mrps13"
## [340] "mrps26" "rpms13" "nhp2"
## [343] "nola2" "MGC81290" "rpl31"
## [346] "MGC68562" "rplp1" "ecgp"
## [349] "gp96" "grp94" "hsp90b1"
## [352] "MGC68448" "tra1" "MGC69128"
## [355] "qars" "qars.S" "qars1"
## [358] "MGC68529" "rpl27" "rps14"
## [361] "rps14-prov" "rpl4" "mrpl44"
## [364] "MGC64315" "rpl18" "rpl29a"
## [367] "L14B" "rpl35a.L" "rpl37a"
## [370] "MGC64263" "rpl18a" "mrt4"
## [373] "mrto4" "rps2" "rps2e"
## [376] "rpl9" "rpl15" "rpl10"
## [379] "eef-2" "eef2" "eef2.1"
## [382] "ef2" "eftud2" "snrp116"
## [385] "snu114" "rps12a" "rpl13a"
## [388] "RPL17" "ckm" "ckm.S"
## [391] "ckmm" "m-ck" "rpl3"
## [394] "rpl10a" "rps6-a" "rps6-b"
## [397] "rps6b" "rps3a-a" "arbp"
## [400] "l10e" "lp0" "prlp0"
## [403] "rplp0" "rpp0" "rpl5"
## [406] "LOC446289" "rack1" "exosc8"
## [409] "MGC52847" "rpl12" "XL34"
## [412] "rsl1d1" "rps6kl1.L" "hsp90aa1.1.L"
## [415] "mrps14.S" "rpl37a.L" "mrpl48"
## [418] "LOC108707957" "eef2.2.L" "MGC68699"
## [421] "mrpl38" "LOC108698235" "rpl3l.S"
## [424] "LOC108707348" "galk2" "LOC108710319"
## [427] "rpl7l1" "LOC108715263" "LOC121395362"
## [430] "LOC121396125" "LOC108705858" "LOC121396648"
## [433] "LOC121396642" "mrlp45" "LOC108706079"
## [436] "LOC121400386" "LOC121400621" "LOC121400576"
## [439] "LOC108709120" "LOC108712266" "LOC121402816"
## [442] "LOC121402815" "LOC100037062" "rpl22l1"
## [445] "LOC100037111" "LOC733390" "mrpl37"
## [448] "LOC100037184" "PDCD9" "LOC100049095"
## [451] "LOC443704" "mrps2" "LOC100049746"
## [454] "mrps25" "LOC100126614" "LOC100126616"
## [457] "mrpl32" "LOC100127333" "LOC100127340"
## [460] "mrps35" "LOC100137646" "LOC445824"
## [463] "LOC733351" "MGC131350" "mrps30"
## [466] "LOC733400" "MGC130910" "rpl7"
## [469] "LOC733385" "MGC131341" "mrpl21"
## [472] "rps16" "LOC446962" "LOC733302"
## [475] "LOC734162" "LOC496259" "mrpl11"
## [478] "LOC496258" "mrpl13" "LOC496046"
## [481] "exosc4" "LOC495942" "LOC495666"
## [484] "LOC495349" "LOC495212" "MGC85354"
## [487] "mrpl53" "MGC85374" "rpl32"
## [490] "MGC85232" "MGC84749" "eft-2-prov"
## [493] "exosc7" "exosc7-prov" "mrpl1"
## [496] "mrpl1-prov" "MGC81028" "rlp24"
## [499] "rpl24l" "rsl24d1" "rvas3"
## [502] "MGC82344" "imp3" "MGC81216"
## [505] "MGC80065" "rps5" "mrps17"
## [508] "mrps12" "LOC398682" "mrpl18"
## [511] "mrps34" "mrpl43" "RPL18A"
## [514] "pdcd9" "S14" "RACK1"
provisional2 <- intersect( c( paste0( potential, ".L" ), paste0( potential, ".S" )), rownames(xenopus))
provisional2
## [1] "rpsa.L" "ogfod1.L" "impact.L" "rps6ka1.L" "rps6ka4.L"
## [6] "gfm1.L" "ptcd3.L" "snu13.L" "riox2.L" "nsa2.L"
## [11] "mrpl24.L" "top2a.L" "rps6ka3.L" "rps6kb1.L" "mlh3.L"
## [16] "rpl24.L" "rps20.L" "rps7.L" "rps15.L" "rpl34.L"
## [21] "rpl17.L" "rps19.L" "rpl11.L" "rpl35.L" "rpl6.L"
## [26] "rps6kc1.L" "mrps18a.L" "rps12.L" "rps24.L" "rpl7a.L"
## [31] "mrpl4.L" "eprs1.L" "ipo7.L" "rps8.L" "rps6.L"
## [36] "rps11.L" "rpl8.L" "rpl28.L" "rpl27a.L" "rpl22.L"
## [41] "rps27.L" "rps13.L" "rps4x.L" "rpl21.L" "rpl26.L"
## [46] "mrpl45.L" "rps10.L" "mrps16.L" "mrps9.L" "rplp2.L"
## [51] "mrps18b.L" "rpl39.L" "fau.L" "mrpl52.L" "mrps11.L"
## [56] "mrpl20.L" "mrpl12.L" "mrps33.L" "mrpl3.L" "mrpl2.L"
## [61] "rps28p9.L" "rpl36.L" "rpl36a.L" "rpl38.L" "rpl29.L"
## [66] "rpl23a.L" "mrpl51.L" "mrpl28.L" "rps26.L" "rps23.L"
## [71] "mrpl15.L" "yy1.L" "mrps7.L" "rpl19.L" "rps25.L"
## [76] "rps27a.L" "rpl37.L" "rpl30.L" "rps17.L" "galk1.L"
## [81] "uba52.L" "exosc9.L" "rps18.L" "mrps26.L" "rpl31.L"
## [86] "rplp1.L" "hsp90b1.L" "qars1.L" "rpl27.L" "rps14.L"
## [91] "rpl4.L" "rpl18.L" "rpl18a.L" "mrto4.L" "rps2.L"
## [96] "rpl9.L" "rpl15.L" "rpl10.L" "eef2.1.L" "eftud2.L"
## [101] "rpl13a.L" "ckm.L" "rpl3.L" "rplp0.L" "rpl5.L"
## [106] "rack1.L" "exosc8.L" "rpl12.L" "rsl1d1.L" "mrpl38.L"
## [111] "galk2.L" "rpl7l1.L" "rpl22l1.L" "mrpl37.L" "mrps2.L"
## [116] "mrps25.L" "mrpl32.L" "mrps35.L" "mrps30.L" "rpl7.L"
## [121] "mrpl21.L" "rps16.L" "mrpl11.L" "mrpl53.L" "rpl32.L"
## [126] "mrpl1.L" "rsl24d1.L" "imp3.L" "rps5.L" "mrps17.L"
## [131] "mrps12.L" "mrpl18.L" "mrpl43.L" "rpsa.S" "rps6ka6.S"
## [136] "snu13.S" "hsp90ab1.S" "nsa2.S" "rps6ka3.S" "rps6kb1.S"
## [141] "pms1.S" "rpl24.S" "rps20.S" "rps7.S" "rps15.S"
## [146] "rpl34.S" "rpl17.S" "rpl23.S" "rps19.S" "rpl11.S"
## [151] "rpl6.S" "rps15a.S" "rps6kc1.S" "mlh1.S" "rps12.S"
## [156] "mrpl9.S" "mrps36.S" "rpl7a.S" "XB5843130.S" "eprs1.S"
## [161] "ipo7.S" "rps8.S" "rpl35a.S" "rps6.S" "rpl8.S"
## [166] "rpl28.S" "rpl27a.S" "rpl22.S" "rps27.S" "rps13.S"
## [171] "rps4x.S" "rpl26.S" "rps10.S" "mrpl16.S" "rplp2.S"
## [176] "mrpl23.S" "mrps18b.S" "rpl39.S" "fau.S" "rps29.S"
## [181] "mrpl52.S" "mrpl40.S" "mrps11.S" "mrps33.S" "mrpl3.S"
## [186] "rps28p9.S" "rpl36.S" "rpl38.S" "rpl29.S" "rpl23a.S"
## [191] "rps21.S" "rps26.S" "rps23.S" "yy1.S" "mrps7.S"
## [196] "rps25.S" "rpl13.S" "mrpl17.S" "rps27a.S" "rpl37.S"
## [201] "rpl30.S" "rps17.S" "rpl14.S" "nhp2.S" "rpl31.S"
## [206] "hsp90b1.S" "qars1.S" "rpl27.S" "rps14.S" "rpl4.S"
## [211] "mrpl44.S" "rpl18.S" "rpl18a.S" "mrto4.S" "rpl15.S"
## [216] "rpl10.S" "eef2.1.S" "eftud2.S" "rpl13a.S" "rpl10a.S"
## [221] "rplp0.S" "rpl5.S" "rack1.S" "exosc8.S" "rpl12.S"
## [226] "rsl1d1.S" "mrpl48.S" "mrps2.S" "mrps25.S" "mrpl32.S"
## [231] "mrps30.S" "rps16.S" "mrpl11.S" "mrpl13.S" "exosc4.S"
## [236] "exosc7.S" "mrpl1.S" "rsl24d1.S" "imp3.S" "rps5.S"
## [241] "mrps12.S" "mrps34.S"
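The logic behind `potential` and `provisional2` is easier to follow on a small example: names that do not match a feature as they stand are retried with the `.L` and `.S` homeolog suffixes. The sketch below uses toy vectors only; every `.toy` object is hypothetical, and the final `union()` step is not in the original code but shows one way the direct and suffix-rescued matches might be combined.

```r
# Toy versions of the objects used above; all values are illustrative.
rownames.toy    <- c("rpsa.L", "rpsa.S", "rps6.L", "top2b.S", "actb.L")
provisional.toy <- c("rpsa.S", "rpsa", "rps6", "lambr", "top2b.S")

# Names that do not match a feature name as they stand.
potential.toy <- setdiff(provisional.toy, rownames.toy)

# Try both homeolog suffixes and keep the candidates that exist as features.
candidates.toy   <- c(paste0(potential.toy, ".L"), paste0(potential.toy, ".S"))
provisional2.toy <- intersect(candidates.toy, rownames.toy)
provisional2.toy

# Direct matches plus suffix-rescued matches.
matched.toy <- union(intersect(provisional.toy, rownames.toy), provisional2.toy)
matched.toy
```

Note that a synonym such as `lambr` is never rescued, because neither `lambr.L` nor `lambr.S` exists as a feature; such names need a manual look-up.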
potential2 <- setdiff( provisional, provisional2 )
potential2
## [1] "37lrp" "lambr" "lamr1"
## [4] "LR" "lrp" "p40"
## [7] "RPSA" "rpsa" "37LRP"
## [10] "67LR" "LamR" "LBP/p40"
## [13] "LRP/LR" "LOC100037080" "rps3-a"
## [16] "rps6ka" "rps3-b" "ogfod1"
## [19] "impact" "top2b.S" "top2b.L"
## [22] "pp90rsk4" "rps6ka1" "rps6ka6"
## [25] "rsk4" "lonp1.L" "LONP1"
## [28] "hu-1" "mapkapk1a" "rsk"
## [31] "rsk1" "rps6ka4" "gfm1"
## [34] "efg1" "rpl4-b" "rpl1b"
## [37] "rpl18-b" "rpl14b" "rpl5-a"
## [40] "rpl5-b" "lonp2" "ptcd3"
## [43] "snu13" "nhp2l1" "hsp90ab1"
## [46] "hsp90beta" "XELAEV_18028538mg" "mina"
## [49] "mina-prov" "mina.L" "NO52"
## [52] "riox2" "cdk105" "hcl-g1"
## [55] "hclg1" "hussy-29" "hussy29"
## [58] "nsa2" "tinp1" "yr-29"
## [61] "mrpl24" "MGC81220" "LOC398512"
## [64] "top2" "top2a" "tp2a"
## [67] "fa-1" "fa1" "hoip"
## [70] "nhp2l1-b" "nhpx" "otk27"
## [73] "snrnp15-5" "spag12" "ssfa1"
## [76] "rpls3-b" "p90" "rps6ka3"
## [79] "RSK2" "rsk2" "S6KII"
## [82] "p70-alpha" "p70-s6k" "p70s6k"
## [85] "ps6k" "rps6kb1" "rps6kb1-A"
## [88] "s6k" "s6K1" "stk14a"
## [91] "pms1" "LOC108703225" "LOC108700925"
## [94] "LOC108700150" "gnb2-rs1" "gnb2l1"
## [97] "h12.3" "hlc-7" "pig21"
## [100] "mlh3" "rps6ka5.L" "L24"
## [103] "rpl24" "rpl24-A" "rps9.L"
## [106] "LOC121394610" "rps20" "pnpt1.S"
## [109] "LOC108717861" "dba8" "rps7"
## [112] "rpS8A" "rpS8B" "eprs"
## [115] "mrpl49.L" "mvd.L" "nhp2.L"
## [118] "LOC108710993" "rps27l.L" "mrpl30.S"
## [121] "mrps15.L" "rig" "rps15"
## [124] "mrpl35.L" "L34" "rpl34"
## [127] "xl34" "mvk.L" "gfm2.L"
## [130] "EFG2" "GFM2" "b"
## [133] "l17" "pd-1" "rpl17"
## [136] "rpl17-a" "rpl17-b" "rpl23"
## [139] "rps19" "rpl11" "l35"
## [142] "rpl35" "LOC108698757" "LOC108698451"
## [145] "MGC114789" "LOC108703484" "pms2.L"
## [148] "rpl6" "rps15a" "rps22"
## [151] "LOC108706570" "mrps31.S" "rps27l.S"
## [154] "rpk118" "rps6kc1" "s6pkh1"
## [157] "mrp-s18-3" "mrps18-3" "mrps18a"
## [160] "s18bmt" "LOC108717877" "hsp90b"
## [163] "mlh1" "mutL" "LOC121393781"
## [166] "LOC121393773" "eprs.S" "rps12"
## [169] "rps12-a" "rps12-b" "rps12b"
## [172] "rps24" "LOC121393051" "comp72"
## [175] "l9mt" "mrpl9" "mrps31.L"
## [178] "LOC100036779" "dc47" "LOC100037103"
## [181] "mrp-s36" "mrps36" "mrps36.L"
## [184] "LOC100037089" "rpl7a" "LOC100049136"
## [187] "XB5843130" "LOC100101334" "rps3a-b"
## [190] "LOC100126642" "cgi-28" "l4mt"
## [193] "mrpl4" "LOC100127328" "LOC100037086"
## [196] "ffcskk.L" "fcsk" "fuk"
## [199] "fuk.L" "LOC100137687" "eprs1"
## [202] "LOC100158438" "imp7" "ipo7"
## [205] "MGC52556" "ranbp7" "rps8"
## [208] "rpl35a" "rpl4-a" "rpl-4"
## [211] "rpl1a" "rpl18-a" "rpl14a"
## [214] "rps6" "rps11" "rpl8"
## [217] "rpl28" "rpl27a" "rpl22"
## [220] "rps27" "rps13" "rps4"
## [223] "rps4x" "rpl21" "rpl26"
## [226] "mrpl45" "rps10" "LOC100158393"
## [229] "mrps16" "mrps9" "mrpl16"
## [232] "mrpl16.L" "MGC154377" "rplp2"
## [235] "LOC734179" "MGC131313" "mrpl23"
## [238] "mrps24-b" "MGC130639" "mrp-s18-2"
## [241] "mrps18-2" "mrps18b" "ptd017"
## [244] "s18amt" "SCL75" "MGC116452"
## [247] "MGC116477" "rpl39" "rpl39-a"
## [250] "rpl39-b" "rpl39a" "rpl39b"
## [253] "MGC116425" "fau" "MGC116435"
## [256] "MGC114875" "rps29" "MGC130892"
## [259] "mrpl52" "MGC115435" "MGC114621"
## [262] "MGC98504" "LOC733189" "MGC115171"
## [265] "LOC496101" "mrpl40" "LOC495996"
## [268] "mrps11" "LOC495474" "mrpl20"
## [271] "LOC495364" "mrpl12" "LOC495310"
## [274] "mrps33" "LOC494992" "mrpl3"
## [277] "rpl3l.L" "LOC494722" "rpl3l"
## [280] "cgi-22" "MGC84466" "mrp-l14"
## [283] "mrpl2" "rpml14" "MGC85550"
## [286] "rps28p9" "rpl36" "LOC108700056"
## [289] "MGC85428" "rpl36a" "rpl36a.S"
## [292] "MGC85404" "rpl38" "l28"
## [295] "LOC100101273" "rpl28-a" "rpl28-b"
## [298] "MGC85384" "rpl29" "rpl29-a"
## [301] "rpl29-b" "rpl29b" "l23a"
## [304] "mda20" "MGC85348" "rpl23a"
## [307] "MGC85310" "mrpl51" "mrpl28"
## [310] "rps21" "MGC86356" "rps26"
## [313] "MGC86316" "rps23" "mrpl15"
## [316] "MGC78885" "rpl17a" "FIII"
## [319] "ino80s" "nf-e1" "ucrbp"
## [322] "xyy1" "yin-yang-1" "yy1"
## [325] "yy1-a" "yy1-b" "mrps7"
## [328] "MGC84358" "rpl19.S" "rpl19"
## [331] "rpl19-prov" "MGC83421" "mrpl41-a"
## [334] "LOC398653" "MGC82151" "rps25"
## [337] "MGC82136" "rpl13" "rpl13-prov"
## [340] "MGC83084" "mrpl17" "MGC81889"
## [343] "rps27a" "MGC82973" "rpl37"
## [346] "l30" "MGC82844" "rpl30"
## [349] "rpl30-a" "rpl30-b" "MGC82841"
## [352] "rps17" "LOC108700787" "MGC82808"
## [355] "galk1" "MGC82807" "rps9.S"
## [358] "MGC80804" "rps9" "MGC80774"
## [361] "MGC80700" "MGC80199" "surf3"
## [364] "trup" "MGC80163" "LOC108706905"
## [367] "MGC80109" "uba52" "exosc9"
## [370] "scl75" "mrpl41-b" "MGC83076"
## [373] "rpl14" "mrps24-a" "LOC108700218"
## [376] "MGC82306" "rps18" "MGC82245"
## [379] "mrp-s13" "mrp-s26" "mrps13"
## [382] "mrps26" "rpms13" "nhp2"
## [385] "nola2" "MGC81290" "rpl31"
## [388] "MGC68562" "rplp1" "ecgp"
## [391] "gp96" "grp94" "hsp90b1"
## [394] "MGC68448" "tra1" "MGC69128"
## [397] "qars" "qars.S" "qars1"
## [400] "MGC68529" "rpl27" "rps14"
## [403] "rps14-prov" "rpl4" "mrpl44"
## [406] "LOC108702869" "MGC64315" "rpl18"
## [409] "rpl29a" "L14B" "LOC121393140"
## [412] "rpl35a.L" "LOC121400351" "rpl37a"
## [415] "MGC64263" "rpl18a" "mrt4"
## [418] "mrto4" "rps2" "rps2e"
## [421] "rpl9" "rpl15" "rpl10"
## [424] "eef-2" "eef2" "eef2.1"
## [427] "ef2" "eftud2" "snrp116"
## [430] "snu114" "rps12a" "rpl13a"
## [433] "RPL17" "ckm" "ckm.S"
## [436] "ckmm" "m-ck" "rpl3"
## [439] "rpl10a" "rps6-a" "rps6-b"
## [442] "rps6b" "rps3a-a" "arbp"
## [445] "l10e" "lp0" "prlp0"
## [448] "rplp0" "rpp0" "rpl5"
## [451] "LOC446289" "rack1" "exosc8"
## [454] "MGC52847" "rpl12" "XL34"
## [457] "mrpl10.S" "trap1.S" "rsl1d1"
## [460] "trap1.L" "mrps10.S" "mrps21.S"
## [463] "exosc5.L" "rps6kl1.L" "hsp90aa1.1.L"
## [466] "LOC108698781" "mrps21.L" "LOC108697549"
## [469] "LOC108697688" "XB5896631.L" "LOC108695345"
## [472] "mrpl55.L" "mrpl57.S" "mrpl33.L"
## [475] "mrpl14.L" "mrps5.L" "mrps14.S"
## [478] "LOC108715766" "LOC108715857" "exosc6.L"
## [481] "rps19bp1.L" "LOC108714734" "mrpl46.S"
## [484] "mrpl42.S" "mrpl46.L" "rpl37a.L"
## [487] "LOC108711439" "LOC108712865" "mrps6.S"
## [490] "mrpl48" "mrpl39.L" "LOC108707957"
## [493] "mrpl54.S" "mrps18c.S" "LOC108712230"
## [496] "LOC108713416" "eef2.2.L" "MGC68699"
## [499] "mrps18c.L" "mrpl38" "LOC108698235"
## [502] "LOC108700022" "mrpl27.L" "rpl3l.S"
## [505] "LOC108707348" "LOC108708365" "mrpl42.L"
## [508] "galk2" "LOC108710319" "rpl7l1"
## [511] "mrps27.L" "mrpl47.L" "mrpl19.S"
## [514] "LOC108715263" "LOC108716102" "mrps22.L"
## [517] "LOC108717404" "LOC108705784" "LOC108717492"
## [520] "LOC121394503" "LOC121395362" "LOC121396125"
## [523] "LOC108705858" "LOC121396648" "LOC121396642"
## [526] "dap3.L" "mrlp45" "LOC108706079"
## [529] "LOC121400386" "LOC121400621" "LOC121400576"
## [532] "LOC108709120" "mrpl22.L" "LOC108712266"
## [535] "LOC108712883" "LOC121402816" "LOC121402815"
## [538] "LOC121393045" "LOC100037062" "rpl22l1"
## [541] "LOC100037111" "LOC733390" "mrpl37"
## [544] "LOC100037184" "PDCD9" "LOC100049095"
## [547] "LOC443704" "mrps2" "LOC100049746"
## [550] "mrps25" "LOC100126614" "LOC100126616"
## [553] "mrpl32" "LOC100127333" "LOC100127340"
## [556] "mrps35" "LOC100137646" "LOC445824"
## [559] "LOC733351" "MGC131350" "mrps30"
## [562] "LOC733400" "MGC130910" "rpl7"
## [565] "LOC733385" "MGC131341" "mrpl21"
## [568] "rps16" "LOC446962" "LOC733302"
## [571] "LOC734162" "LOC496259" "mrpl11"
## [574] "LOC496258" "mrpl13" "LOC496046"
## [577] "exosc4" "LOC495942" "LOC495666"
## [580] "LOC495349" "LOC495212" "MGC85354"
## [583] "mrpl53" "MGC85374" "rpl32"
## [586] "MGC85232" "MGC84749" "eft-2-prov"
## [589] "exosc7" "exosc7-prov" "mrpl1"
## [592] "mrpl1-prov" "MGC81028" "rlp24"
## [595] "rpl24l" "rsl24d1" "rvas3"
## [598] "MGC82344" "imp3" "MGC81216"
## [601] "MGC80065" "rps5" "mrps17"
## [604] "mrps12" "LOC398682" "mrpl18"
## [607] "mrps34" "mrpl43" "RPL18A"
## [610] "pdcd9" "S14" "RACK1"
grep("rpl", rownames(xenopus), value = T) %>% sort()
## [1] "mrpl1.L" "mrpl1.S" "mrpl10.S" "mrpl11.L" "mrpl11.S" "mrpl12.L"
## [7] "mrpl13.S" "mrpl14.L" "mrpl15.L" "mrpl16.S" "mrpl17.S" "mrpl18.L"
## [13] "mrpl19.S" "mrpl2.L" "mrpl20.L" "mrpl21.L" "mrpl22.L" "mrpl23.S"
## [19] "mrpl24.L" "mrpl27.L" "mrpl28.L" "mrpl3.L" "mrpl3.S" "mrpl30.S"
## [25] "mrpl32.L" "mrpl32.S" "mrpl33.L" "mrpl35.L" "mrpl36.L" "mrpl37.L"
## [31] "mrpl38.L" "mrpl39.L" "mrpl4.L" "mrpl40.S" "mrpl41.L" "mrpl41.S"
## [37] "mrpl42.L" "mrpl42.S" "mrpl43.L" "mrpl44.S" "mrpl45.L" "mrpl46.L"
## [43] "mrpl46.S" "mrpl47.L" "mrpl48.S" "mrpl49.L" "mrpl51.L" "mrpl52.L"
## [49] "mrpl52.S" "mrpl53.L" "mrpl54.S" "mrpl55.L" "mrpl57.S" "mrpl58.L"
## [55] "mrpl58.S" "mrpl9.S" "rpl10.L" "rpl10.S" "rpl10a.S" "rpl11.L"
## [61] "rpl11.S" "rpl12.L" "rpl12.S" "rpl13.S" "rpl13a.L" "rpl13a.S"
## [67] "rpl14.S" "rpl15.L" "rpl15.S" "rpl17.L" "rpl17.S" "rpl18.L"
## [73] "rpl18.S" "rpl18a.L" "rpl18a.S" "rpl19.L" "rpl21.L" "rpl22.L"
## [79] "rpl22.S" "rpl22l1.L" "rpl23.S" "rpl23a.L" "rpl23a.S" "rpl24.L"
## [85] "rpl24.S" "rpl26.L" "rpl26.S" "rpl27.L" "rpl27.S" "rpl27a.L"
## [91] "rpl27a.S" "rpl28.L" "rpl28.S" "rpl29.L" "rpl29.S" "rpl3.L"
## [97] "rpl30.L" "rpl30.S" "rpl31.L" "rpl31.S" "rpl32.L" "rpl34.L"
## [103] "rpl34.S" "rpl35.L" "rpl35a.S" "rpl36.L" "rpl36.S" "rpl36a.L"
## [109] "rpl37.L" "rpl37.S" "rpl38.L" "rpl38.S" "rpl39.L" "rpl39.S"
## [115] "rpl4.L" "rpl4.S" "rpl5.L" "rpl5.S" "rpl6.L" "rpl6.S"
## [121] "rpl7.L" "rpl7a.L" "rpl7a.S" "rpl7l1.L" "rpl8.L" "rpl8.S"
## [127] "rpl9.L" "rplp0.L" "rplp0.S" "rplp1.L" "rplp2.L" "rplp2.S"
ribo.genes <- intersect( rownames(xenopus), c(provisional, provisional2) )
ribo.genes
## [1] "mrps26.L" "ptcd3.L" "mrpl35.L" "rpl9.L" "exosc9.L"
## [6] "rpl34.L" "mrpl1.L" "rpl36.L" "eef2.1.L" "LOC108712230"
## [11] "rps15.L" "rps28p9.L" "lonp1.L" "uba52.L" "LOC108713416"
## [16] "rps6.L" "mrps18c.L" "mrpl52.L" "mvk.L" "rplp0.L"
## [21] "rpl6.L" "rps23.L" "mrps27.L" "gfm2.L" "nsa2.L"
## [26] "mrps30.L" "rpl37.L" "rpl17.L" "LOC108706570" "rpl34.S"
## [31] "mrpl1.S" "rpl36.S" "eef2.1.S" "mrpl54.S" "rps15.S"
## [36] "rps28p9.S" "LOC108706905" "rps6.S" "mrps18c.S" "mrpl52.S"
## [41] "rplp0.S" "mrpl40.S" "rpl6.S" "rps23.S" "nsa2.S"
## [46] "mrps30.S" "rpl37.S" "rpl17.S" "mrpl39.L" "riox2.L"
## [51] "rpl23a.L" "rps6ka3.L" "LOC398653" "rps6ka1.L" "rps10.L"
## [56] "rpl8.L" "mrps15.L" "rpl11.L" "rps6kb1.L" "mrps17.L"
## [61] "rpl24.L" "LOC108708365" "rpl31.L" "mrps9.L" "rps26.L"
## [66] "mrps31.L" "exosc8.L" "rpl21.L" "rpl23a.S" "mrps6.S"
## [71] "rps6ka3.S" "XB5843130.S" "rpl8.S" "rps10.S" "rpl10a.S"
## [76] "rps6ka6.S" "LOC121400351" "rpl11.S" "rps6kb1.S" "rpl24.S"
## [81] "rpl31.S" "mrpl30.S" "rps26.S" "mrps31.S" "exosc8.S"
## [86] "MGC80700" "mrpl48.S" "mrps33.L" "mrps35.L" "mrpl22.L"
## [91] "rps14.L" "rpl26.L" "LOC108710993" "hsp90b1.L" "rps16.L"
## [96] "mrpl42.L" "rpl18a.L" "rps27l.L" "rps17.L" "mrpl18.L"
## [101] "rsl24d1.L" "galk2.L" "LOC108711439" "mrps11.L" "mrpl46.L"
## [106] "rpl4.L" "mrpl4.L" "mrps33.S" "rps14.S" "rpl4.S"
## [111] "mrpl46.S" "mrps11.S" "LOC108712865" "LOC108712883" "rsl24d1.S"
## [116] "rps17.S" "rps27l.S" "rpl18a.S" "mrpl42.S" "rps16.S"
## [121] "hsp90b1.S" "nhp2.S" "rpl26.S" "LOC108703484" "rps13.L"
## [126] "ipo7.L" "mrpl21.L" "rplp2.L" "mrpl49.L" "fau.L"
## [131] "rps6ka4.L" "mrpl11.L" "ogfod1.L" "LOC121393045" "mvd.L"
## [136] "exosc6.L" "ffcskk.L" "rps8.L" "mrpl37.L" "rpl5.L"
## [141] "snu13.L" "rpsa.L" "imp3.L" "rpl3.L" "rps19bp1.L"
## [146] "rpl29.L" "rpl32.L" "mrps25.L" "qars1.L" "LOC108714734"
## [151] "mrpl11.S" "fau.S" "rplp2.S" "mrpl23.S" "ipo7.S"
## [156] "rps13.S" "rpl13.S" "rps8.S" "rpl5.S" "snu13.S"
## [161] "rpsa.S" "imp3.S" "LOC108715766" "rpl29.S" "LOC108715857"
## [166] "mrps25.S" "qars1.S" "rps27a.L" "rps6kc1.L" "eprs1.L"
## [171] "mrpl33.L" "LOC108716102" "mrpl14.L" "mrps18a.L" "mrpl2.L"
## [176] "rpl7l1.L" "mrps5.L" "rps12.L" "LOC121393773" "LOC121393781"
## [181] "mrpl47.L" "rpl22l1.L" "gfm1.L" "mrps22.L" "rps7.L"
## [186] "hsp90ab1.S" "eprs1.S" "rps6kc1.S" "mrpl57.S" "rps27a.S"
## [191] "pnpt1.S" "LOC108717404" "LOC108717861" "mrps36.S" "rps12.S"
## [196] "LOC121393140" "LOC108705784" "rpl35a.S" "mrpl44.S" "rps7.S"
## [201] "LOC108717492" "mrpl19.S" "mrpl55.L" "rpl15.L" "top2b.L"
## [206] "LOC121394610" "mrpl3.L" "mrpl32.L" "impact.L" "mrpl15.L"
## [211] "rps20.L" "rpl7.L" "LOC121394503" "mrpl53.L" "rpl30.L"
## [216] "rplp1.L" "rpl15.S" "top2b.S" "rpl14.S" "mrpl3.S"
## [221] "exosc7.S" "mrpl32.S" "mlh1.S" "rps20.S" "LOC108695345"
## [226] "rpl30.S" "mrpl13.S" "MGC114621" "exosc4.S" "mrpl51.L"
## [231] "mrpl43.L" "LOC121393051" "rps24.L" "rpl27a.L" "XB5896631.L"
## [236] "rps25.L" "mrpl20.L" "mrps16.L" "rpl22.L" "rpl18.L"
## [241] "rpl28.L" "rps9.L" "rps19.L" "mrto4.L" "mrpl16.S"
## [246] "LOC108697549" "rpl27a.S" "rps25.S" "rpl22.S" "LOC108697688"
## [251] "rpl18.S" "rpl28.S" "rps9" "rps9.S" "rps19.S"
## [256] "mrto4.S" "rpl10.L" "rpl12.L" "mrps2.L" "rpl7a.L"
## [261] "rpl35.L" "rpl36a.L" "rps4x.L" "rpl39.L" "LOC108698451"
## [266] "ckm.L" "rack1.L" "rps5.L" "rps18.L" "mrps18b.L"
## [271] "exosc5.L" "mrps12.L" "mlh3.L" "rps6ka5.L" "LOC108698757"
## [276] "yy1.L" "LOC108698781" "mrps21.L" "mrpl24.L" "rps27.L"
## [281] "dap3.L" "rpl10.S" "yy1.S" "rps29.S" "mrps10.S"
## [286] "mrpl17.S" "rpl12.S" "mrps2.S" "rpl7a.S" "LOC108700022"
## [291] "rps4x.S" "LOC108700056" "rpl39.S" "LOC108700150" "rack1.S"
## [296] "rps5.S" "LOC108700218" "mrps18b.S" "mrps12.S" "mrps21.S"
## [301] "rps27.S" "mrpl9.S" "top2a.L" "rpl19.L" "LOC108700787"
## [306] "mrpl45.L" "eftud2.L" "LOC108700925" "rpl38.L" "mrpl12.L"
## [311] "rpl27.L" "mrpl27.L" "mrps7.L" "galk1.L" "mrpl38.L"
## [316] "rps11.L" "rpl13a.L" "mrpl28.L" "rsl1d1.L" "MGC114789"
## [321] "pms2.L" "trap1.L" "rps2.L" "rpl23.S" "mrpl10.S"
## [326] "eftud2.S" "rps21.S" "rpl38.S" "rpl27.S" "mrps7.S"
## [331] "LOC108702869" "rpl13a.S" "pms1.S" "rsl1d1.S" "rps15a.S"
## [336] "mrps34.S" "trap1.S" "LOC108703225"
ribo.genes %>% length()
## [1] 338
Save the ribosomal genes that you curated
xenopus[["percent.ribo"]] <- Seurat::PercentageFeatureSet( xenopus, features = ribo.genes )
xenopus@meta.data
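To make the meaning of `percent.ribo` concrete, here is a minimal base-R sketch of the quantity that `PercentageFeatureSet()` reports: for each cell, the share of all counts that fall on the chosen features. The count matrix and the two-gene feature set are made up for illustration; only the arithmetic is the point.

```r
# Toy count matrix: genes in rows, cells in columns (values are made up)
counts <- matrix(
  c(10,  0,  5,
    30, 20,  5,
    60, 80, 90),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("rpl5.L", "rps6.S", "actb.L"), c("cell1", "cell2", "cell3"))
)
toy.ribo.genes <- c("rpl5.L", "rps6.S")  # hypothetical feature set

# Percentage of each cell's counts that come from the feature set
percent.toy <- colSums(counts[toy.ribo.genes, , drop = FALSE]) / colSums(counts) * 100
percent.toy
```

A cell with a very high value here spends most of its captured RNA on the listed genes, which is why the choice of `ribo.genes` matters.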
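Depending on the situation, it helps to have a list of hemoglobin genes to recognize one particular cell type, the red blood cells. Hemoglobin genes follow a typical naming convention, which I have used to build the list manually, but check that the result is correct: the pattern below also matches genes such as hbp1 and hbs1l, which are not hemoglobin genes.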
rbc.genes <-
setdiff(
grep("^hb", rownames(xenopus), value = T),
grep("^hb(egf|ox)", rownames(xenopus), value = T)
)
rbc.genes
## [1] "hbp1.L" "hbp1.S" "hbs1l.L" "hba-l5.L" "hba3.L" "hbg1.L" "hbg2.S"
xenopus[["percent.rbc"]] <- Seurat::PercentageFeatureSet( xenopus, features = rbc.genes )
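The prefix-based selection is easy to check on a small vector. The sketch below reuses the gene names from the output above and adds two made-up names (`hbegf.L`, `hbox10.L`) to stand in for the excluded families. The stricter pattern is only one possible assumption about the naming convention (a subunit letter followed by a digit or a hyphen), not a curated rule, so inspect its result on the real gene list before using it.

```r
# Gene names copied from the output above, plus two made-up names
# ("hbegf.L", "hbox10.L") standing in for the excluded families
toy.genes <- c("hbp1.L", "hbp1.S", "hbs1l.L", "hba-l5.L", "hba3.L",
               "hbg1.L", "hbg2.S", "hbegf.L", "hbox10.L")

# Original approach: everything starting with "hb", minus hbegf/hbox
broad <- setdiff(
  grep("^hb", toy.genes, value = TRUE),
  grep("^hb(egf|ox)", toy.genes, value = TRUE)
)

# Stricter (hypothetical) pattern: "hb", one subunit letter, then a digit or "-"
strict <- grep("^hb[abdegz][0-9-]", toy.genes, value = TRUE)

broad
strict
```

Comparing `broad` and `strict` shows which names the broad prefix pulls in beyond the globin-like ones.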
Save the hemoglobin genes that you have curated
With these three additions, you now have per-cell meta information
that you can inspect in the @meta.data slot of the Seurat
object:
xenopus@meta.data
We can do the visualization:
VlnPlot(
xenopus,
features = c(
"nFeature_RNA",
"nCount_RNA",
"percent.mt",
"percent.ribo"
),
ncol = 4,
pt.size = 0
)
VlnPlot(
xenopus,
features = c(
"percent.rbc"
)
)
## Cell-cycle genes
In the Seurat tutorial, the cell-cycle genes are taken from a single paper, which is a rather ad hoc way to define them:
data('cc.genes')
gene.list.frog <- rownames(xenopus)
# cc.genes$s.genes <- map( cc.genes$s.genes, simpleCap )
# cc.genes$g2m.genes <- map( cc.genes$g2m.genes, simpleCap )
# setdiff( tolower(cc.genes$s.genes), gene.list.frog ) # Mlf1ip is Cenpu
tolower(cc.genes$s.genes) %>% sort()
## [1] "atad2" "blm" "brip1" "casp8ap2" "ccne2" "cdc45"
## [7] "cdc6" "cdca7" "chaf1b" "clspn" "dscc1" "dtl"
## [13] "e2f8" "exo1" "fen1" "gins2" "gmnn" "hells"
## [19] "mcm2" "mcm4" "mcm5" "mcm6" "mlf1ip" "msh2"
## [25] "nasp" "pcna" "pola1" "pold3" "prim1" "rad51"
## [31] "rad51ap1" "rfc2" "rpa2" "rrm1" "rrm2" "slbp"
## [37] "tipin" "tyms" "ubr7" "uhrf1" "ung" "usp1"
## [43] "wdr76"
grep(
paste0( "^(", paste( tolower(cc.genes$s.genes), collapse = "|" ), ")" ),
gene.list.frog,
value = T
) %>% sort()
## [1] "atad2.L" "atad2.S" "atad2b.L" "atad2b.S" "blm.S"
## [6] "blmh.L" "blmh.S" "brip1.L" "ccne2.L" "ccne2.S"
## [11] "cdc45.L" "cdc45.S" "cdc6.L" "cdc6.S" "cdca7.L"
## [16] "cdca7.S" "cdca7l.S" "clspn.S" "dscc1.L" "dtl.L"
## [21] "dtl.S" "e2f8.L" "exo1.S" "fen1.L" "fen1.S"
## [26] "gins2.L" "gmnn.L" "gmnn.S" "hells.L" "hells.S"
## [31] "mcm2.L" "mcm2.S" "mcm4.L" "mcm4.S" "mcm5.L"
## [36] "mcm5.S" "mcm6.2.L" "mcm6.2.S" "msh2.L" "nasp.L"
## [41] "nasp.S" "pcna.L" "pcna.S" "pola1.S" "pold3.L"
## [46] "prim1.S" "rad51.L" "rad51.S" "rad51ap1.L" "rad51c.L"
## [51] "rad51d.L" "rfc2.L" "rpa2.L" "rpa2.S" "rrm1.L"
## [56] "rrm1.S" "rrm2.1.L" "rrm2.2.L" "rrm2b.S" "slbp.L"
## [61] "slbp.S" "tipin.L" "tipin.S" "tyms.L" "ubr7.L"
## [66] "ubr7.S" "uhrf1.L" "uhrf1.S" "uhrf1bp1l.L" "ung.L"
## [71] "usp1.L" "usp1.S" "usp10.L" "usp10.S" "usp12.L"
## [76] "usp12.S" "usp12b.L" "usp12b.S" "usp13.L" "usp14.L"
## [81] "usp14.S" "usp15.L" "usp16.L" "usp16.S" "usp19.L"
## [86] "usp19.S" "wdr76.S"
map_dfr(
cc.genes$s.genes,
function(g) {
tibble(
gene = tolower(g),
match = paste( grep( paste0( "^", tolower(g), "\\." ), gene.list.frog, value = T), collapse = "," )
)
}
)
map_dfr(
cc.genes$g2m.genes,
function(g) {
tibble(
gene = tolower(g),
match = paste( grep( paste0( "^", tolower(g), "\\." ), gene.list.frog, value = T), collapse = "," )
)
}
)
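The difference between the two matching strategies used above (prefix only versus prefix followed by a literal dot) can be seen on a handful of names taken from the earlier output. This is a small illustrative sketch, not part of the analysis itself.

```r
toy.genes <- c("usp1.L", "usp1.S", "usp10.L", "usp12b.S", "rrm2.1.L", "rrm2b.S")

# Prefix only: also picks up longer gene names that share the prefix
loose <- grep("^usp1", toy.genes, value = TRUE)

# Prefix followed by a literal dot: only the gene itself and its .L/.S copies
exact <- grep("^usp1\\.", toy.genes, value = TRUE)

# Numbered copies such as rrm2.1.L still match, paralogs such as rrm2b.S do not
rrm2 <- grep("^rrm2\\.", toy.genes, value = TRUE)

loose
exact
rrm2
```

This is why the curated lists are built with the `\\.` suffix: without it, unrelated genes such as usp10 would be scored as S-phase markers.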
# Two genes are not found
# casp8ap2
# mlf1ip - cenpu
grep("cenpu", gene.list.frog, value = T)
## [1] "cenpu.L"
# grep("ced-", gene.list.frog, value = T) # https://www.xenbase.org/entry/gene/showgene.do?method=displayGeneSummary&geneId=XB-GENE-6257801
# fam64a
# ckap2l
# hjurp
# hn1
# cdca2
# psrc1
grep("pimreg", gene.list.frog, value = T) # fam64a
## [1] "pimreg.L"
grep("ckap2", gene.list.frog, value = T) # ckap2l https://www.xenbase.org/entry/gene/showgene.do?method=displayGeneSummary&geneId=XB-GENE-13579809 not present?
## [1] "ckap2.L" "ckap2.S"
grep("jpt1", gene.list.frog, value = T) # hn1
## [1] "jpt1.L" "jpt1.S"
grep("22068216", gene.list.frog, value = T) # https://www.xenbase.org/entry/gene/showgene.do?method=displayGeneSummary&geneId=XB-GENE-22068215
## character(0)
cc.genes.frog <- cc.genes
cc.genes.frog$s.genes <- c(
grep(
paste0( "^(", paste( tolower(cc.genes$s.genes), collapse = "|" ), ")\\." ),
gene.list.frog,
value = T
),
"cenpu.L"
)
cc.genes.frog$g2m.genes <- c(
grep(
paste0( "^(", paste( tolower(cc.genes$g2m.genes), collapse = "|" ), ")\\." ),
gene.list.frog,
value = T
),
"pimreg.L",
"jpt1.L",
"jpt1.S"
)
cc.genes.frog
## $s.genes
## [1] "slbp.L" "uhrf1.L" "ung.L" "cdc45.L" "slbp.S"
## [6] "uhrf1.S" "cdc45.S" "rpa2.L" "brip1.L" "rfc2.L"
## [11] "rrm1.L" "pold3.L" "pola1.S" "rpa2.S" "clspn.S"
## [16] "prim1.S" "rrm1.S" "tipin.L" "pcna.L" "pcna.S"
## [21] "blm.S" "wdr76.S" "tipin.S" "e2f8.L" "fen1.L"
## [26] "gins2.L" "usp1.L" "nasp.L" "mcm5.L" "mcm2.L"
## [31] "fen1.S" "usp1.S" "nasp.S" "mcm5.S" "mcm2.S"
## [36] "msh2.L" "dtl.L" "rrm2.2.L" "rrm2.1.L" "dtl.S"
## [41] "gmnn.L" "tyms.L" "mcm4.L" "ccne2.L" "dscc1.L"
## [46] "atad2.L" "gmnn.S" "mcm4.S" "ccne2.S" "atad2.S"
## [51] "hells.L" "hells.S" "rad51ap1.L" "ubr7.L" "rad51.L"
## [56] "ubr7.S" "exo1.S" "rad51.S" "cdc6.L" "mcm6.2.L"
## [61] "cdca7.L" "cdc6.S" "mcm6.2.S" "cdca7.S" "cenpu.L"
##
## $g2m.genes
## [1] "tacc3.L" "hmgb2.L" "cenpe.L" "cks2.L" "tacc3.S" "hmgb2.S"
## [7] "cks2.S" "ndc80.L" "cdca8.L" "cbx5.L" "ckap2.L" "ndc80.S"
## [13] "cdca8.S" "birc5.S" "cbx5.S" "ckap2.S" "hmmr.L" "cdc25c.L"
## [19] "gas2l3.L" "tmpo.L" "gtse1.L" "ccnb2.L" "kif23.L" "aurkb.L"
## [25] "kif23.S" "ccnb2.S" "gtse1.S" "tmpo.S" "cdc25c.S" "aurkb.S"
## [31] "ckap5.L" "ctcf.L" "cdc20.L" "kif2c.L" "nuf2.L" "rangap1.L"
## [37] "ctcf.S" "cdc20.S" "kif2c.S" "nuf2.S" "rangap1.S" "cenpf.L"
## [43] "bub1.L" "ect2.L" "smc4.L" "nek2.S" "ect2.S" "anln.L"
## [49] "cdca3.L" "cdk1.L" "mki67.L" "kif11.L" "ncapd2.S" "cdk1.S"
## [55] "mki67.S" "kif11.S" "kif20b.S" "tubb4b.L" "cenpa.L" "dlgap5.L"
## [61] "nusap1.L" "anp32e.L" "cks1b.L" "g2e3.S" "dlgap5.S" "tubb4b.S"
## [67] "nusap1.S" "anp32e.S" "cks1b.S" "top2a.L" "aurka.L" "tpx2.L"
## [73] "ube2c.L" "aurka.S" "tpx2.S" "ube2c.S" "pimreg.L" "jpt1.L"
## [79] "jpt1.S"
Try out below:
# DefaultAssay(xenopus) <- "RNA"
#
# xenopus <-
# CellCycleScoring(
# xenopus,
# s.features = unlist(cc.genes.frog$s.genes),
# g2m.features = unlist(cc.genes.frog$g2m.genes),
# set.ident = F
# )
#
# xenopus@meta.data
Why is the error happening?
xenopus@meta.data %>%
dplyr::slice(sample(1:n())) %>% # just to avoid any artificial "clumping" because of the library
ggplot( aes(x = nCount_RNA, y = nFeature_RNA)) +
# ggplot( aes(x = nUMI, y = nGene, colour=percent.mito)) +
geom_point( alpha = 0.5 ) +
geom_smooth() +
# geom_hline( yintercept = 500, linetype = "dashed", colour="salmon" ) +
# geom_hline( yintercept = 300, linetype = "dashed", colour="blue" ) +
scale_x_continuous(
labels = scales::comma
) +
scale_y_continuous(labels = scales::comma) +
# facet_wrap( library ~ . , scale = "free_x") +
# scale_color_hue(name = "mitochondrial content",
# labels = c(">25%", "<=25%")) +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
This looks OK, with some large cells (in terms of RNA content) spread out around the UMI = 2,500 threshold.
There can also be cells with a lower gene content than the rest. That is not the case in this dataset, but here is one example:
Example UMI Gene relationship
xenopus@meta.data %>%
mutate(
call = case_when(
percent.mt > 20 ~ "mito > 20%",
percent.rbc > 1 ~ "rbc > 1%",
TRUE ~ "pass"
)
) %>%
dplyr::slice(sample(1:n())) %>% # just to avoid any artificial "clumping" because of the library
ggplot( aes(x = nCount_RNA, y = nFeature_RNA, colour=call)) +
# ggplot( aes(x = nUMI, y = nGene, colour=percent.mito)) +
geom_point( alpha = 0.5 ) +
geom_hline( yintercept = 1000, linetype = "dashed", colour="salmon" ) +
scale_x_continuous(
labels = scales::comma
) +
scale_y_continuous(labels = scales::comma) +
scale_color_manual(
name = "outliers",
values = c(
"mito > 20%" = "blue",
"rbc > 1%" = "salmon",
"pass" = "grey"
)
) +
ggtitle(
"Stable relationship between nUMI and nGene"
) +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
The same alternative example again, here showing cases where RBC
contamination is strong.
xenopus@meta.data %>%
dplyr::slice(sample(1:n())) %>% # just to avoid any artificial "clumping" because of the library
mutate(
state = case_when(
percent.mt > 20 ~ "mito(20%+)",
percent.mt > 5 ~ "mito(5%+)",
percent.rbc > 1 ~ "rbc(1%+)",
TRUE ~ "pass"
)
) %>%
ggplot( aes(x = nFeature_RNA, y = percent.mt, colour=state)) +
geom_point( alpha = 0.5 ) +
geom_hline( yintercept = 1, linetype = "dashed", colour="salmon" ) +
geom_hline( yintercept = 5, linetype = "dashed", colour="salmon" ) +
geom_hline( yintercept = 20, linetype = "dashed", colour="navy" ) +
geom_vline( xintercept = 1000, linetype = "dashed", colour="blue" ) +
geom_vline( xintercept = 2500, linetype = "dashed", colour="salmon" ) +
scale_x_continuous(
breaks = c(0, 1000, 2000, 4000, 6000, 8000, 10000, 20000, 30000),
labels = scales::comma
) +
scale_color_manual(
name = "outliers",
values = c(
"mito(20%+)" = "navy",
"mito(5%+)" = "green",
"pass" = "grey",
"rbc(1%+)" = "salmon"
)
) +
ggtitle("Exploration of potential dead cells",
paste0(
"Clear inverse relationship between # of genes and mito content,\n",
"for low nGene small cells"
)
) +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
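The labels in the plots above depend on the order of the conditions: a cell that satisfies several of them receives only the first matching label. The base-R sketch below mimics that top-to-bottom behaviour of `case_when()` with nested `ifelse()` calls on made-up QC values, so the effect of the ordering can be checked by eye.

```r
# Made-up QC values for four cells
qc <- data.frame(
  percent.mt  = c(25, 8, 2, 2),
  percent.rbc = c(3, 0, 4, 0)
)

# Conditions are checked top to bottom; the first TRUE one sets the label,
# which is also how dplyr::case_when() resolves overlapping conditions
qc$state <- ifelse(qc$percent.mt > 20, "mito(20%+)",
            ifelse(qc$percent.mt > 5,  "mito(5%+)",
            ifelse(qc$percent.rbc > 1, "rbc(1%+)", "pass")))
qc
```

The first cell exceeds both the mitochondrial and the RBC threshold but is reported only as a high-mito cell, so counts per label understate how many cells fail the later checks.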
## Ribosomal content
colnames( xenopus@meta.data )
## [1] "orig.ident" "nCount_RNA" "nFeature_RNA" "percent.mt" "percent.ribo"
## [6] "percent.rbc"
xenopus@meta.data %>%
dplyr::filter( percent.mt < 20 ) %>%
ggplot( aes( x = orig.ident, y = percent.ribo ) ) +
geom_violin( scale="width", trim = TRUE )
xenopus@meta.data %>%
ggplot( aes( x = percent.mt, y = percent.ribo ) ) +
geom_point( alpha = 0.2 ) +
# theme(
# axis.text.x = element_text( angle = 45, hjust = 1 )
# ) +
# coord_flip() +
ggtitle("Relationship between ribosomal and mitochondrial content")
xenopus <- NormalizeData(
xenopus,
normalization.method = "LogNormalize",
scale.factor = 10000
)
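As a rough illustration of what the "LogNormalize" method does, the sketch below applies the same recipe by hand to a made-up count matrix: divide each cell's counts by that cell's total, multiply by the scale factor, and take `log1p()`. It is a simplified stand-in for checking the arithmetic, not Seurat's own implementation.

```r
# Toy counts: genes in rows, cells in columns (values are made up)
counts <- matrix(
  c(1,  0,
    3,  5,
    6, 15),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2"))
)
scale.factor <- 10000

# Divide each column (cell) by its total, multiply by the scale factor, then log1p
norm <- log1p(sweep(counts, 2, colSums(counts), "/") * scale.factor)
norm
```

After this step, cells with very different sequencing depth become comparable, because every cell is expressed on the same "counts per 10,000" scale before the log transform.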
xenopus <- FindVariableFeatures(
xenopus,
selection.method = "vst",
nfeatures = 2000
)
# Identify the 10 most highly variable genes
top10 <- head(VariableFeatures(xenopus), 10)
top10
## [1] "LOC108714608" "LOC121398924" "LOC121393091" "LOC121394899" "itln1.L"
## [6] "otogl2.L" "gfus.L" "XB5922676.S" "LOC108702929" "gpx3.S"
# plot variable features with and without labels
plot1 <- VariableFeaturePlot(xenopus)
plot2 <- LabelPoints(plot = plot1, points = top10, repel = TRUE)
## When using repel, set xnudge and ynudge to 0 for optimal results
plot1 + plot2
top10
## [1] "LOC108714608" "LOC121398924" "LOC121393091" "LOC121394899" "itln1.L"
## [6] "otogl2.L" "gfus.L" "XB5922676.S" "LOC108702929" "gpx3.S"
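Next, we apply a linear transformation (‘scaling’) that is a standard pre-processing step prior to dimensional reduction techniques like PCA. The ScaleData() function shifts the expression of each gene so that its mean across cells is 0, and scales it so that its variance across cells is 1. The results are stored in xenopus[["RNA"]]@scale.data.
# Conduct scaling to everything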
tictoc::tic()
xenopus <- ScaleData(
xenopus,
features = rownames(xenopus),
vars.to.regress = "percent.mt"
)
## Regressing out percent.mt
## Centering and scaling data matrix
tictoc::toc()
## 397.395 sec elapsed
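The centering and scaling part of this step can be sketched in base R on a made-up matrix. This sketch leaves out the regression of `percent.mt` and any clipping of extreme values, so it only illustrates the per-gene z-scoring, under the assumption that genes are in rows and cells in columns.

```r
set.seed(1)
# Made-up normalized expression: 3 genes x 6 cells
expr <- matrix(rnorm(18, mean = 5, sd = 2), nrow = 3,
               dimnames = list(paste0("gene", 1:3), paste0("cell", 1:6)))

# scale() works on columns, so transpose to center/scale each gene across cells
scaled <- t(scale(t(expr)))

round(rowMeans(scaled), 10)   # each gene now has mean 0
apply(scaled, 1, sd)          # and standard deviation 1
```

Because every gene ends up on the same scale, highly expressed genes no longer dominate the PCA simply by having larger values.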
tictoc::tic()
xenopus <- RunPCA(xenopus, features = VariableFeatures(object = xenopus))
## PC_ 1
## Positive: XB5922676.S, tuba1cl.3.L, LOC121397893, LOC121394899, LOC108702929, dynlrb2.S, tuba1cl.2.S, LOC108714608, tekt2.S, ccdc63.L
## dnali1.S, LOC121393091, LOC100137623, ak1.L, ak1.S, LOC108720114, LOC108719383, odf3.L, tubb4b.L, tuba1cl.2.L
## meig1.S, odf3.S, dynlt2b.L, fam166c.S, enkur.L, cfap45.S, LOC121399427, selenow2.L, rsph1.L, pifo.S
## Negative: LOC108709680, tmsb4x.L, XB22164552.S, pfn1.L, krt19.L, LOC108698272, ly6g6c.L, s100a10.S, s100a10.L, slit2.S
## LOC121395537, anxa1.2.S, acta2.S, mmp1.S, mt4.L, ets2.L, marcks.L, LOC121398924, LOC121395448, cldn6.2.S
## arpc5.S, olfm4.L, acta2.L, COX3, mmp8.S, LOC108699645, fkbp11.L, btg2.S, LOC108697796, XB22065621.L
## PC_ 2
## Positive: atp6v0c.S, atp6v1al.L, atp6v0c.L, ca2.S, atp6v1g3.L, atp6v1g3.S, atp6v1b1.L, ca2.L, atp6v0e1.L, txn.L
## slc26a4l.L, atp6v0d2.L, atp6v0b.S, atp6v1d.S, atp6v1f.L, atp6ap1.1.S, atp6v0e1.S, hspe1.L, cycs.S, atp6ap1.1.L
## foxi1.L, LOC398702, cycs.L, hspe1.S, atp6v0a4.L, gsta1.S, wfdc2.L, cox7a2.L, fth1.1.S, atp5mc3.S
## Negative: LOC108709680, tmsb4x.L, krt19.L, pfn1.L, LOC108698272, ly6g6c.L, XB22164552.S, s100a10.S, slit2.S, s100a10.L
## anxa1.2.S, LOC121395537, acta2.S, tuba1cl.3.L, mmp1.S, gby.L, LOC108702929, btg2.S, cfap45.S, LOC121397893
## XB5922676.S, tuba1cl.2.S, LOC121394899, LOC121395448, ets2.L, LOC121393091, acta2.L, LOC108714608, ccdc63.L, olfm4.L
## PC_ 3
## Positive: tmsb4x.L, krt19.L, pfn1.L, atp6v1g3.L, ca2.L, atp6v1al.L, atp6v1g3.S, azin2.S, atp6v1b1.L, ca2.S
## LOC108709680, atp6v0c.L, atp6v0b.S, atp6v0d2.L, atp6v1d.S, slc26a4l.L, atp6v0c.S, atp6ap1.1.S, atp6v0e1.S, foxi1.L
## atp6v1f.L, atp6ap1.1.L, atp6v0a4.L, LOC108719387, cystm1.S, tbc1d24.2.L, hspe1.S, atp5mc3.S, LOC108704370, cycs.S
## Negative: otogl2.L, LOC108699763, fucolectin.S, LOC108719453, itln1.L, LOC108696889, sult6b1.5.L, atp12a.L, LOC108696890, MGC68910
## atp1b2.S, LOC108697896, XB5953580.L, ldhb.S, LOC108700425, agr2.L, tll2l.L, LOC121397762, MGC84752, sytl1.S
## upk1a.S, crisp1.7.L, LOC108699644, XB5774338.L, LOC100158288, psca.S, LOC108699649, capn9.L, upk3a.L, gfus.L
## PC_ 4
## Positive: LOC108697796, LOC121398924, ano1.L, camk1.L, kcna4.S, LOC108713813, mal2.S, mal.L, foxa1.L, pou3f1.S
## emx2.L, spdef.S, atp1b1.L, fkbp11.L, LOC108696980, galnt6.2.L, gpx3.S, dut.S, elapor1.S, LOC108697876
## pts.L, rtp3a.2.L, ATP6, elovl7.S, krt18.1.S, ND3, nans.S, sars1.S, XB5717875.L, LOC108696984
## Negative: tmsb4x.L, ly6g6c.L, XB22164552.S, krt19.L, pfn1.L, LOC108709680, LOC108698272, s100a10.S, s100a10.L, azin2.S
## LOC121395537, anxa1.2.S, mmp1.S, slit2.S, mt4.L, acta2.S, cldn6.2.S, btg2.S, arpc5.S, LOC108699649
## LOC121397602, LOC121395448, LOC108699645, ets2.L, olfm4.L, ctnnb1.L, atp1b2.S, acta2.L, aldob.L, XB5953580.L
## PC_ 5
## Positive: gpx3.S, krt18.1.S, XB22065621.L, krt18.1.L, marcks.L, fn1.S, marcks.S, actc1.S, tnn.L, hoxc10.L
## LOC108709895, prmt1.L, XB5768883.L, pcdh8.2.L, rdd4.L, vim.L, XB5733233.S, LOC108696924, mycn.L, actc1.L
## cyyr1.L, fzd7.L, LOC108701391, cst3.L, col2a1.L, atp5mc3.L, LOC108704022, marcksl1.S, twist1.L, atp5mc3.S
## Negative: LOC121398924, LOC108698272, LOC108697796, camk1.L, ano1.L, LOC108709680, kcna4.S, slc26a4.3.S, LOC108713813, mal2.S
## slc26a4.3.L, XB22164552.S, ca12.L, fetub.S, slc16a3.L, atp1b1.L, mal.L, s100a10.S, foxa1.L, LOC108703568
## krt19.L, ndfip2.L, emx2.L, spdef.S, pfn1.L, galnt6.2.L, atp6v1b2.S, pou3f1.S, LOC108697862, psca.L
tictoc::toc()
## 11.219 sec elapsed
# Examine and visualize PCA results a few different ways
print(xenopus[["pca"]], dims = 1:5, nfeatures = 5)
## PC_ 1
## Positive: XB5922676.S, tuba1cl.3.L, LOC121397893, LOC121394899, LOC108702929
## Negative: LOC108709680, tmsb4x.L, XB22164552.S, pfn1.L, krt19.L
## PC_ 2
## Positive: atp6v0c.S, atp6v1al.L, atp6v0c.L, ca2.S, atp6v1g3.L
## Negative: LOC108709680, tmsb4x.L, krt19.L, pfn1.L, LOC108698272
## PC_ 3
## Positive: tmsb4x.L, krt19.L, pfn1.L, atp6v1g3.L, ca2.L
## Negative: otogl2.L, LOC108699763, fucolectin.S, LOC108719453, itln1.L
## PC_ 4
## Positive: LOC108697796, LOC121398924, ano1.L, camk1.L, kcna4.S
## Negative: tmsb4x.L, ly6g6c.L, XB22164552.S, krt19.L, pfn1.L
## PC_ 5
## Positive: gpx3.S, krt18.1.S, XB22065621.L, krt18.1.L, marcks.L
## Negative: LOC121398924, LOC108698272, LOC108697796, camk1.L, ano1.L
VizDimLoadings(xenopus, dims = 1:2, reduction = "pca")
ElbowPlot(xenopus)
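An elbow plot shows the standard deviation of each principal component, and the "elbow" is where additional components stop contributing much. The sketch below uses base-R `prcomp()` on made-up data as a stand-in for the PCA stored in the Seurat object, to show how the standard deviations translate into a share of variance per component.

```r
set.seed(42)
# Made-up data: 50 cells (rows) x 8 genes (columns)
toy <- matrix(rnorm(50 * 8), nrow = 50)
toy[, 1:3] <- toy[, 1:3] + rnorm(50, sd = 3)   # add a shared signal to three genes

pca <- prcomp(toy, center = TRUE, scale. = TRUE)

# Standard deviation per PC (what an elbow plot shows) and share of variance
pc.sd <- pca$sdev
var.explained <- pc.sd^2 / sum(pc.sd^2)
round(var.explained, 3)
```

In this toy case the first component carries the shared signal and the rest are noise. On real data the drop is usually more gradual, which makes the number of `dims` passed to later steps a judgment call.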
xenopus <- FindNeighbors(xenopus, dims = 1:5)
## Computing nearest neighbor graph
## Computing SNN
xenopus <- FindClusters(xenopus, resolution = 0.1)
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
##
## Number of nodes: 4969
## Number of edges: 148042
##
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.9605
## Number of communities: 6
## Elapsed time: 0 seconds
# If you haven't installed UMAP, you can do so via reticulate::py_install(packages =
# 'umap-learn')
xenopus <- RunUMAP(xenopus, dims = 1:5)
## Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
## To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
## This message will be shown once per session
## 20:22:06 UMAP embedding parameters a = 0.9922 b = 1.112
## 20:22:06 Read 4969 rows and found 5 numeric columns
## 20:22:06 Using Annoy for neighbor search, n_neighbors = 30
## 20:22:06 Building Annoy index with metric = cosine, n_trees = 50
## 0% 10 20 30 40 50 60 70 80 90 100%
## [----|----|----|----|----|----|----|----|----|----|
## **************************************************|
## 20:22:07 Writing NN index file to temp file /var/folders/qp/vf8kcj3d33q8rcd916m21zwh0000gn/T//RtmpXBfLCJ/fileb52b2ddf1fdc
## 20:22:07 Searching Annoy index using 1 thread, search_k = 3000
## 20:22:09 Annoy recall = 100%
## 20:22:09 Commencing smooth kNN distance calibration using 1 thread with target n_neighbors = 30
## 20:22:10 Initializing from normalized Laplacian + noise (using irlba)
## 20:22:10 Commencing optimization for 500 epochs, with 195330 positive edges
## 20:22:17 Optimization finished
DimPlot(xenopus, reduction = "umap")
xenopus@meta.data
# install fast differential gene expression
devtools::install_github("immunogenomics/presto")
DefaultAssay( xenopus ) <- "RNA"
presto::wilcoxauc( xenopus, 'RNA_snn_res.0.1' ) %>%
group_by( group ) %>%
arrange( group, pct_out - pct_in) %>%
#filter( row_number() <= 5 ) %>%
dplyr::filter( group == 2 )
presto::wilcoxauc( xenopus, 'RNA_snn_res.0.1' ) %>%
dplyr::filter( grepl("txn.L", feature)) # ionocyte = cluster == 1
presto::wilcoxauc( xenopus, 'RNA_snn_res.0.1' ) %>%
dplyr::arrange( desc(auc) ) %>%
dplyr::filter( grepl("foxa1", feature)) # small secretory cell = cluster == 2
presto::wilcoxauc( xenopus, 'RNA_snn_res.0.1' ) %>%
dplyr::filter( feature == "itln1.L") # goblet cells = cluster == 3
presto::wilcoxauc( xenopus, 'RNA_snn_res.0.1' ) %>%
dplyr::filter( grepl("tekt2.S", feature)) # multi ciliated cell = cluster == 4
presto::wilcoxauc( xenopus, 'RNA_snn_res.0.1' ) %>%
dplyr::filter( grepl("gpx3.S", feature)) # basal = cluster == 5
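The `auc` column used above can be read as the probability that a randomly chosen cell inside the cluster expresses the gene more highly than a randomly chosen cell outside it (ties counting one half). The base-R sketch below computes that quantity from the Wilcoxon rank-sum statistic on made-up values. It is meant to illustrate the interpretation and assumes that the reported AUC follows this standard definition.

```r
# Made-up expression values for one gene, inside and outside a cluster
in.cluster  <- c(2.1, 3.4, 0.0, 2.8, 3.9)
out.cluster <- c(0.0, 0.5, 1.2, 0.0, 2.1, 0.3)

# The rank-sum statistic counts pairs where the in-cluster value is larger
# (ties count 1/2); exact = FALSE avoids the warning about ties
w <- wilcox.test(in.cluster, out.cluster, exact = FALSE)$statistic
auc <- unname(w) / (length(in.cluster) * length(out.cluster))
auc
```

An AUC near 1 marks a gene that is almost always higher inside the cluster, 0.5 means no separation, and values near 0 mark genes that are depleted in the cluster.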
Traditional way of finding marker genes (as an example):
# find all markers of cluster 5
cluster5.markers <- FindMarkers(xenopus, ident.1 = 5, min.pct = 0.25)
## For a more efficient implementation of the Wilcoxon Rank Sum Test,
## (default method for FindMarkers) please install the limma package
## --------------------------------------------
## install.packages('BiocManager')
## BiocManager::install('limma')
## --------------------------------------------
## After installation of limma, Seurat will automatically use the more
## efficient implementation (no further action necessary).
## This message will be shown once per session
head(cluster5.markers, n = 5)
Below is boilerplate code for checking the expression pattern of selected genes:
FeaturePlot(
xenopus,
features = c("gpx3.S", "prmt1.S"),
order = T
)
# new.cluster.ids <- c(
# 1 = "ionocyte",
# 2 = "small secretory",
# 3 = "goblet",
# 4 = "multi-ciliated",
# 5 = "basal"
# )
xenopus$annotation <- case_when(
xenopus$RNA_snn_res.0.1 == 1 ~ "ionocyte",
xenopus$RNA_snn_res.0.1 == 2 ~ "small_secretory",
xenopus$RNA_snn_res.0.1 == 3 ~ "goblet",
xenopus$RNA_snn_res.0.1 == 4 ~ "multi_ciliated",
xenopus$RNA_snn_res.0.1 == 5 ~ "basal",
xenopus$RNA_snn_res.0.1 == 0 ~ "early_epithelial_progenitor",
TRUE ~ "ambiguous"
)
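An alternative to `case_when()` is a named lookup vector, which is roughly what the commented-out `new.cluster.ids` draft was aiming for. The following is a minimal sketch on a toy factor; the names `clusters` and `cluster_labels` are hypothetical, and the cluster-to-label mapping simply mirrors the one used above.

```r
# Toy cluster assignments; Seurat stores cluster ids as a factor
clusters <- factor(c("0", "4", "2", "5", "7", "1", "3"))

# Named lookup table: names are cluster ids, values are annotations
cluster_labels <- c(
  "0" = "early_epithelial_progenitor",
  "1" = "ionocyte",
  "2" = "small_secretory",
  "3" = "goblet",
  "4" = "multi_ciliated",
  "5" = "basal"
)

# Index by character, not by the factor itself: a factor would be used
# as integer codes and could silently pick the wrong labels
annotation <- unname(cluster_labels[as.character(clusters)])

# Clusters missing from the table come back as NA; mark them explicitly
annotation[is.na(annotation)] <- "ambiguous"
print(annotation)
```

The lookup table keeps the mapping in one place, which makes it easier to revise when the clustering resolution changes.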
DimPlot(
xenopus,
group.by = "annotation",
label = T
) +
theme( legend.position = "bottom" )
grep("dll", rownames(xenopus), value = T)
## [1] "dll1.L" "dll1.S"
FeaturePlot(
xenopus,
features = c(
# "htr3a.L", # serotonin receptor
"notch2.L",
"notch2.S",
"notch1.L",
"notch1.S"
),
order = T
)
FeaturePlot(
xenopus,
features = c(
"dll1.L",
"dll1.S"
),
order = T
)
Save any variable that is important. Explicitly selecting the variables to keep for later is more helpful than saving the whole current environment as is.
tictoc::tic()
saveRDS( xenopus, file=glue::glue("{project.prefix}xenopusobject.rds"))
tictoc::toc()
## 33.331 sec elapsed
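The save-and-restore round trip can be sketched on a small object. This is a minimal example, assuming a toy list in place of the Seurat object and a hypothetical file name inside a temporary directory.

```r
# Toy object standing in for a large analysis result
result <- list(counts = matrix(1:6, nrow = 2), note = "toy example")

# Hypothetical output path inside a temporary directory
rds_path <- file.path(tempdir(), "toy_result.rds")

saveRDS(result, file = rds_path)  # writes exactly one object
restored <- readRDS(rds_path)     # returns the object; you choose its name

# TRUE when the restored object matches the original
print(identical(result, restored))
```

Because `readRDS()` returns the object instead of restoring it under its old name, loading it in a later session cannot silently overwrite a variable that already exists.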
Check first what we have for the bulk RNA-seq dataset:
cpm <- read_tsv("../ChungKwon2014.XENLA_rfx2mo_exp/Chung2014.cpm_table.tsv")
## Rows: 42675 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (1): ID
## dbl (4): ctrlA, ctrlB, rfx2moA, rfx2moB
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
cpm
We need to check whether the gene names are interoperable between the two datasets.
intersect( rownames(xenopus), cpm$ID ) %>% length()
## [1] 17509
setdiff( cpm$ID, rownames(xenopus) ) %>% length()
## [1] 25166
setdiff( rownames(xenopus), cpm$ID ) %>% length()
## [1] 5
There are 5 genes that are present in the scRNA-seq data based on XenLae10.1 but are NOT present in the CPM data:
setdiff( rownames(xenopus), cpm$ID )
## [1] "XFG 5-1" "ccdc50.L-1" "unassigned-gene-2"
## [4] "unassigned-gene-4" "unassigned-gene-23"
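The three set operations used above partition each gene list into a shared part and a part unique to that list, so their sizes can be cross-checked. In the real data, 17509 + 25166 = 42675, which matches the number of rows in the CPM table. A minimal sketch with toy vectors (the IDs are placeholders, not a statement about which genes are in either dataset):

```r
# Toy gene ID lists standing in for rownames(xenopus) and cpm$ID
sc_genes   <- c("gpx3.S", "tekt2.S", "itln1.L", "unassigned-gene-2")
bulk_genes <- c("gpx3.S", "tekt2.S", "itln1.L", "foxa1.L", "foxa1.S")

shared    <- intersect(sc_genes, bulk_genes)  # usable in both datasets
only_bulk <- setdiff(bulk_genes, sc_genes)    # bulk only
only_sc   <- setdiff(sc_genes, bulk_genes)    # scRNA-seq only

# Sanity check: each list splits into "shared" plus "unique to that list"
stopifnot(length(shared) + length(only_sc)   == length(unique(sc_genes)))
stopifnot(length(shared) + length(only_bulk) == length(unique(bulk_genes)))

print(only_sc)
```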
Should we “normalize” the number of reference cells across annotations? This might be necessary if the annotations are represented very unequally.
xenopus@meta.data %>%
dplyr::count( annotation )
Except for the early epithelial progenitor annotation, most annotations have a comparable number of cells, so we will not normalize the reference.
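If the annotations were badly unbalanced, one option would be to cap the number of cells per annotation by random downsampling. This is not done in this session; the following is only a sketch on toy metadata, and the cap `max_cells` is a hypothetical value.

```r
set.seed(1)  # make the random sampling reproducible

# Toy metadata: one row per cell, with deliberately unbalanced annotations
meta <- data.frame(
  cell = paste0("cell", 1:100),
  annotation = rep(c("progenitor", "goblet", "basal"), times = c(70, 20, 10)),
  stringsAsFactors = FALSE
)

max_cells <- 15  # hypothetical cap per annotation

# For each annotation, keep every cell if there are few, else sample max_cells
keep <- unlist(lapply(split(meta$cell, meta$annotation), function(cells) {
  if (length(cells) <= max_cells) cells else sample(cells, max_cells)
}), use.names = FALSE)

balanced <- meta[meta$cell %in% keep, ]
print(table(balanced$annotation))
```

The same vector of kept cell names could then be used to subset the columns of a count matrix.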
# Extract the count matrix from the scRNA-seq
tictoc::tic()
reference <- as.data.frame(as.matrix( GetAssayData( xenopus, slot = "counts" ) ))
tictoc::toc()
## 5.09 sec elapsed
ncol(reference)
## [1] 4969
nrow(xenopus@meta.data)
## [1] 4969
reference <- reference[intersect( rownames(xenopus), cpm$ID ), ]
tictoc::tic()  # start a new timer for the file write; the earlier one was already stopped
reference %>% rownames_to_column("Genesymbol") %>%
dplyr::select( "Genesymbol", everything() ) %>%
write_tsv(
file= glue::glue("{project.prefix}reference.txt")
)
tictoc::toc()
# tictoc::tic()
# write.table(
# reference,
# file=glue::glue("{project.prefix}reference.txt"),
# sep = "\t",
# quote=FALSE,
# row.names = TRUE,
# col.names = TRUE
# )
# tictoc::toc()
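Some deconvolution tools expect the column headers of a single-cell reference matrix to be cell type labels, not cell barcodes. Whether that applies here is an assumption, so check the requirements of the tool you upload to. The following is a minimal sketch of such a relabelling on a toy matrix. It relies on the columns being in the same order as the metadata rows, which is why comparing `ncol(reference)` with `nrow(xenopus@meta.data)` is a useful first check. The names `counts`, `toy_meta` and `labelled` are hypothetical.

```r
# Toy count matrix: genes in rows, cells (barcodes) in columns
counts <- matrix(
  c(5, 0, 2, 1, 3, 0),
  nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("cell1", "cell2", "cell3"))
)

# Toy metadata with one annotation per cell, keyed by barcode
toy_meta <- data.frame(
  annotation = c("goblet", "basal", "goblet"),
  row.names = c("cell1", "cell2", "cell3")
)

# Confirm the barcodes line up before overwriting them
stopifnot(identical(colnames(counts), rownames(toy_meta)))

# Replace barcodes with cell type labels (duplicate labels are expected)
labelled <- counts
colnames(labelled) <- toy_meta$annotation
print(labelled)
```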
Upload the reference file (the one that has the Genesymbol column) as a single cell reference matrix file.
Uploading your scRNA-seq count matrix
Then generate the signature matrix.
If it works well, it will generate a heat table representing the signature matrix.
Result of signature matrix generation
Then you can conduct the cell fraction inferences:
This produces estimates of the cell type fractions, which you can download.
Notice that the controls contain about 18% multi-ciliated cells and a smaller fraction of goblet cells, but in the morpholino (rfx2mo) samples these cell populations are gone.
Show the slides.
See the YouTube video! (Enrichr)